Problem during CNN synthesis
Created by: MagiPrince
Prerequisites
Please make sure to check off these prerequisites before submitting a bug report.
- [x] Test that the bug appears on the current version of the master branch. Make sure to include the commit hash of the commit you checked out.
- [x] Check that the issue hasn't already been reported, by checking the currently open issues.
- [x] If there are steps to reproduce the problem, make sure to write them down below.
If relevant, please include the hls4ml project files, which were created directly before and/or after the bug.
Quick summary
Hi everyone,
I'm trying to build my quantized CNN model, but when I set the number of filters to 32 for some QConv2D layers, the Concatenate layer triggers an error and the pre-synthesis fails.
I figured out that the Concatenate layer was the one triggering the error by removing it from the model, but I don't understand why it raises an error, since that layer is only supposed to concatenate tensors of shape (None, 2)...
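In case it helps narrow things down, here is a stripped-down sketch of just the tail of the model (the QDense heads feeding the Concatenate). This is only an illustration of the part I suspect, with a placeholder flattened input of 512 features; it is not a standalone reproducer that I have pushed through hls4ml:

```python
from keras.layers import Input, Concatenate, Reshape
from keras.models import Model
from qkeras import QDense, QActivation, quantized_bits, quantized_relu

bits = 16
num_objects = 2

inp = Input(shape=(512,))  # stand-in for the flattened feature map
heads = []
for _ in range(num_objects):
    h = QDense(2, kernel_quantizer=quantized_bits(bits, 0, alpha=1),
               bias_quantizer=quantized_bits(bits, 0, alpha=1),
               kernel_initializer='lecun_uniform', use_bias=True)(inp)
    heads.append(QActivation(quantized_relu(bits, 6))(h))  # each head is (None, 2)

out = Reshape((num_objects, 2))(Concatenate(axis=-1)(heads))
tail = Model(inputs=inp, outputs=out)
tail.summary()
```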
Details
My model:

```python
import numpy as np
import tensorflow as tf
from keras.initializers import glorot_uniform
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, concatenate, Concatenate, Reshape, AveragePooling2D, BatchNormalization, Activation, Add, GlobalAveragePooling2D
from keras.models import Model
from qkeras import *
import contextlib

bits = 16


def basic_block(x, filters, strides=(1, 1)):
    # First convolution layer
    x = QConv2D(filters, (3, 3), strides=strides, padding='same',
                kernel_quantizer=quantized_bits(bits, 6, alpha=1),
                bias_quantizer=quantized_bits(bits, 6, alpha=1),
                kernel_initializer='lecun_uniform', use_bias=True)(x)
    x = QBatchNormalization()(x)
    x = QActivation(quantized_relu(bits, 6))(x)
    # Second convolution layer
    x = QConv2D(filters, (3, 3), strides=(1, 1), padding='same',
                kernel_quantizer=quantized_bits(bits, 6, alpha=1),
                bias_quantizer=quantized_bits(bits, 6, alpha=1),
                kernel_initializer='lecun_uniform', use_bias=True)(x)
    x = QBatchNormalization()(x)
    x = QActivation(quantized_relu(bits, 6))(x)
    return x


def qresnetModelWithLocalization(num_objects):
    # Input layer
    input1 = Input(shape=(64, 64, 3))
    # Initial convolution layer
    x = QConv2D(4, (7, 7), strides=(2, 2), padding='same',
                kernel_quantizer=quantized_bits(bits, 6, alpha=1),
                bias_quantizer=quantized_bits(bits, 6, alpha=1),
                kernel_initializer='lecun_uniform', use_bias=True)(input1)
    x = QBatchNormalization()(x)
    x = QActivation(quantized_relu(bits, 6))(x)
    # Residual blocks
    x = basic_block(x, 4)
    x = basic_block(x, 4)
    x = basic_block(x, 8, strides=(2, 2))
    x = basic_block(x, 8)
    x = basic_block(x, 16, strides=(2, 2))
    x = basic_block(x, 16)
    x = basic_block(x, 32, strides=(2, 2))
    x = basic_block(x, 32)
    x = Flatten()(x)
    output_1 = QDense(2, kernel_quantizer=quantized_bits(bits, 0, alpha=1),
                      bias_quantizer=quantized_bits(bits, 0, alpha=1),
                      kernel_initializer='lecun_uniform', use_bias=True)(x)  # Output: x, y
    concatenated_outputs = QActivation(quantized_relu(bits, 6))(output_1)
    for _ in range(num_objects - 1):
        output_tmp = QDense(2, kernel_quantizer=quantized_bits(bits, 0, alpha=1),
                            bias_quantizer=quantized_bits(bits, 0, alpha=1),
                            kernel_initializer='lecun_uniform', use_bias=True)(x)  # Output: x, y
        output = QActivation(quantized_relu(bits, 6))(output_tmp)
        concatenated_outputs = Concatenate(axis=-1)([concatenated_outputs, output])
    print(concatenated_outputs.shape)
    reshaped_outputs = Reshape((num_objects, 2))(concatenated_outputs)
    # Create the model
    model = Model(inputs=input1, outputs=reshaped_outputs)
    return model
```
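To double-check what actually feeds the Concatenate layer on the Keras side, a quick loop like the one below can be used (just a sketch; the module name matches the import in my hls4ml script, and `m` is a throwaway name here). For num_objects = 2 I would expect both inputs to be (None, 2) and the output (None, 4):

```python
from keras.layers import Concatenate
from qresnet_and_mlp import qresnetModelWithLocalization

m = qresnetModelWithLocalization(2)
for layer in m.layers:
    if isinstance(layer, Concatenate):
        # print the input shapes and the output shape of each Concatenate layer
        print(layer.name, [tuple(t.shape) for t in layer.input], '->', tuple(layer.output.shape))
```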
My hls4ml script:

```python
import hls4ml
import tensorflow as tf
import numpy as np
from qresnet_and_mlp import qresnetModelWithLocalization
import keras
import os

os.environ['PATH'] += os.pathsep + '/tools/Xilinx/Vitis_HLS/2023.1/bin'

model = qresnetModelWithLocalization(2)
model.summary()

config = hls4ml.utils.config_from_keras_model(model, granularity="name")

# Set the precision and reuse factor for the full model
config['Model']['Precision'] = 'ap_fixed<16,6>'
config['Model']['ReuseFactor'] = 4000
config['Model']['Strategy'] = 'Resource'
for layer in config['LayerName'].keys():
    config['LayerName'][layer]['Strategy'] = 'Resource'
    config['LayerName'][layer]['ReuseFactor'] = 4000

cfg = hls4ml.converters.create_config(backend='Vitis')
cfg['IOType'] = 'io_stream'  # Must set this if using CNNs!
cfg['HLSConfig'] = config
cfg['KerasModel'] = model
cfg['OutputDir'] = 'model_1/'
cfg['Part'] = 'xc7z030sbv485-3'
cfg['Interface'] = 'axi_stream'

hls_model = hls4ml.converters.keras_to_hls(cfg)
hls4ml.utils.plot_model(hls_model, show_shapes=True, show_precision=True, to_file=None)

hls_model.compile()

# Use Vitis HLS to synthesize the model
# This might take several minutes
hls_model.build(csim=False, synth=True)
```
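To map the 'layer80' variable named in the error below back to the original graph, the converted layers can be listed right after the conversion (get_layers() iterates over the hls4ml model graph; as far as I understand, the generated variable names are based on the layer index):

```python
# list the converted layers to relate HLS variable names such as 'layer80'
# to the corresponding Keras layers
for layer in hls_model.get_layers():
    print(layer.index, layer.name, type(layer).__name__)
```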
Error returned:

```
ERROR: [HLS 214-256] in function 'myproject(hls::stream<nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 3u>, 0>&, hls::stream<nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 2u>, 0>&)': Unsupported aggregate pragma/directive on variable 'layer80_cpy1' as the bit-width after aggregation (8192) is larger than 4096 (firmware/myproject.cpp:374:28)
ERROR: [HLS 214-256] in function 'myproject(hls::stream<nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 3u>, 0>&, hls::stream<nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 2u>, 0>&)': Unsupported aggregate pragma/directive on variable 'layer80_cpy2' as the bit-width after aggregation (8192) is larger than 4096 (firmware/myproject.cpp:376:25)
```
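If I'm reading the numbers right (just my back-of-the-envelope guess, not something I have confirmed in the generated firmware), the 8192 bits in the error would match the flattened feature map that gets cloned for the two QDense heads: the initial stride-2 convolution plus the three stride-2 blocks reduce the 64x64 input to 4x4, so with 32 filters the Flatten output carries 4 * 4 * 32 = 512 values of 16 bits each, i.e. 8192 bits, which is above the 4096-bit limit mentioned for the aggregate pragma.

```python
# back-of-the-envelope check of the aggregated stream width (my guess, not verified)
spatial = 64 // 2 // 2 // 2 // 2   # initial stride-2 conv + three stride-2 basic blocks -> 4
elements = spatial * spatial * 32  # flattened feature map with 32 filters -> 512
print(elements * 16)               # ap_fixed<16,6> elements -> 8192 bits, > 4096
```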
Steps to Reproduce
Add what needs to be done to reproduce the bug. Add commented code examples and make sure to include the original model files / code, and the commit hash you are working on.
- Use the latest hls4ml version (master branch) and Vitis HLS 2023.1
- Run the hls4ml script above with the model defined above
Expected behavior
The model synthesizes successfully.
Actual behavior
Pre-synthesis fails with the HLS 214-256 errors shown above.