AveragePooling2D/GlobalAveragePooling2D Issue?

I've been tracking down bugs in this TinyML ResNet model; as you can see here (https://github.com/fastmachinelearning/platform_ml_models/blob/master/eembc/CIFAR10_ResNetv1/resnet_v1_eembc.py#L126), it uses an AveragePooling2D layer whose pool_size spans the whole feature map, so it is effectively a GlobalAveragePooling2D layer.
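As a sanity check of that equivalence (a minimal NumPy sketch, not the model itself; the shapes here match the toy input below, not the ResNet's):

```python
import numpy as np

# Hypothetical input matching the toy example below.
H, W, C = 8, 8, 3
x = np.random.rand(H, W, C).astype(np.float32)

# AveragePooling2D with pool_size=(H, W) has a single pooling window
# covering the entire feature map, so its per-channel output is just
# the mean over the spatial dimensions...
avg_pool_full = x.reshape(H * W, C).mean(axis=0)

# ...which is exactly what GlobalAveragePooling2D computes.
global_avg = x.mean(axis=(0, 1))

assert np.allclose(avg_pool_full, global_avg, atol=1e-6)
```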

For some reason, I'm unable to get this type of layer to produce correct results in hls4ml. See this gist to reproduce the error: https://gist.github.com/jmduarte/2bb7ccb8c3028056ef3bdd7f2579250b

I create a random input tensor with shape [8, 8, 3], which after the (global) average pooling should have output values of:

[17.974936   9.814308   4.8630896]

The hls4ml output is instead (erroneously) the same for every channel:

[0.875 0.875 0.875]

This is using ap_fixed<10,7,AP_SAT,AP_RND> precision and io_stream, but varying these settings doesn't restore agreement. GlobalAveragePooling2D shows exactly the same behavior. Curiously, MaxPooling2D and GlobalMaxPooling2D appear to be correct and don't have this issue.
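For reference, here is a rough NumPy model of the ap_fixed<10,7> grid (assuming round-to-nearest and saturation semantics for AP_RND/AP_SAT): with 3 fractional bits the LSB is 0.125 and the range is [-64, 63.875], so the expected channel means sit well inside the representable range and shouldn't collapse to 0.875 from quantization alone.

```python
import numpy as np

# Rough model of ap_fixed<10,7>: 10 total bits, 7 integer bits
# (including sign), hence 3 fractional bits.
# Assumed semantics: AP_RND = round to nearest, AP_SAT = saturate.
def ap_fixed_10_7(x):
    lsb = 2.0 ** -3              # 0.125
    hi = 2.0 ** 6 - lsb          # 63.875
    lo = -(2.0 ** 6)             # -64.0
    return np.clip(np.round(np.asarray(x) / lsb) * lsb, lo, hi)

expected = np.array([17.974936, 9.814308, 4.8630896])
# Quantizing the expected float means keeps them close to their
# original values; none of them lands anywhere near 0.875.
print(ap_fixed_10_7(expected))
```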