Precision in batch norm
Created by: benjaminkreis
@sergojin, @jmduarte, and I found that doing the batch norm computation and casting to res_T
in one line leads to poorer agreement with floating point than expected.
Replacing this:

```cpp
res[ires] = (res_T) (data[ires] - mean[norm_index]) * scale[norm_index] + beta[norm_index];
```
with
```cpp
ap_fixed<32,14> temp = (data[ires] - mean[norm_index]) * scale[norm_index] + beta[norm_index];
res[ires] = (res_T) temp;
```
where ap_fixed<32,14> is an arbitrarily chosen high-precision type, leads to good agreement between res and the floating point expectation. In other words, the cast is not just changing the precision at the end of the calculation; because the cast binds to the (data[ires]-mean[norm_index]) term before the multiply, it influences the precision of the intermediate calculation itself (maybe not a surprise).
It's a small change, but we need to think of something better than hardcoding an arbitrary high-precision type for temp.