Problem during CNN synthesis
Created by: MagiPrince
Prerequisites
Please make sure to check off these prerequisites before submitting a bug report.
- [x] Test that the bug appears on the current version of the master branch. Make sure to include the commit hash of the commit you checked out.
- [x] Check that the issue hasn't already been reported, by checking the currently open issues.
- [x] If there are steps to reproduce the problem, make sure to write them down below.
If relevant, please include the hls4ml project files, which were created directly before and/or after the bug.
Quick summary
Hi everyone,
I'm trying to build my quantized CNN model, but when I set the number of filters to 32 for some QConv2D layers, the Concatenate layer triggers an error and the pre-synthesis fails.
I figured out that the Concatenate layer was the one triggering the error by removing it from the model, but I don't understand why it raises an error, since that layer is only supposed to concatenate tensors of shape (None, 2)...
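In case it helps narrow things down, here is a stripped-down sketch of just the tail of the model (the QDense heads feeding the Concatenate). This is only an illustration of the part I suspect, with a placeholder flattened input of 512 features; it is not a standalone reproducer that I have pushed through hls4ml:

```python
from keras.layers import Input, Concatenate, Reshape
from keras.models import Model
from qkeras import QDense, QActivation, quantized_bits, quantized_relu

bits = 16
num_objects = 2

inp = Input(shape=(512,))  # stand-in for the flattened feature map
heads = []
for _ in range(num_objects):
    h = QDense(2, kernel_quantizer=quantized_bits(bits, 0, alpha=1),
               bias_quantizer=quantized_bits(bits, 0, alpha=1),
               kernel_initializer='lecun_uniform', use_bias=True)(inp)
    heads.append(QActivation(quantized_relu(bits, 6))(h))  # each head is (None, 2)

out = Reshape((num_objects, 2))(Concatenate(axis=-1)(heads))
tail = Model(inputs=inp, outputs=out)
tail.summary()
```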
Details
My model:

```python
import numpy as np
import tensorflow as tf
from keras.initializers import glorot_uniform
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, concatenate, Concatenate, Reshape, AveragePooling2D, BatchNormalization, Activation, Add, GlobalAveragePooling2D
from keras.models import Model
from qkeras import *
import contextlib

bits = 16


def basic_block(x, filters, strides=(1, 1)):
    # First convolution layer
    x = QConv2D(filters, (3, 3), strides=strides, padding='same',
                kernel_quantizer=quantized_bits(bits, 6, alpha=1),
                bias_quantizer=quantized_bits(bits, 6, alpha=1),
                kernel_initializer='lecun_uniform', use_bias=True)(x)
    x = QBatchNormalization()(x)
    x = QActivation(quantized_relu(bits, 6))(x)
    # Second convolution layer
    x = QConv2D(filters, (3, 3), strides=(1, 1), padding='same',
                kernel_quantizer=quantized_bits(bits, 6, alpha=1),
                bias_quantizer=quantized_bits(bits, 6, alpha=1),
                kernel_initializer='lecun_uniform', use_bias=True)(x)
    x = QBatchNormalization()(x)
    x = QActivation(quantized_relu(bits, 6))(x)
    return x


def qresnetModelWithLocalization(num_objects):
    # Input layer
    input1 = Input(shape=(64, 64, 3))
    # Initial convolution layer
    x = QConv2D(4, (7, 7), strides=(2, 2), padding='same',
                kernel_quantizer=quantized_bits(bits, 6, alpha=1),
                bias_quantizer=quantized_bits(bits, 6, alpha=1),
                kernel_initializer='lecun_uniform', use_bias=True)(input1)
    x = QBatchNormalization()(x)
    x = QActivation(quantized_relu(bits, 6))(x)
    # Residual blocks
    x = basic_block(x, 4)
    x = basic_block(x, 4)
    x = basic_block(x, 8, strides=(2, 2))
    x = basic_block(x, 8)
    x = basic_block(x, 16, strides=(2, 2))
    x = basic_block(x, 16)
    x = basic_block(x, 32, strides=(2, 2))
    x = basic_block(x, 32)
    x = Flatten()(x)
    output_1 = QDense(2, kernel_quantizer=quantized_bits(bits, 0, alpha=1),
                      bias_quantizer=quantized_bits(bits, 0, alpha=1),
                      kernel_initializer='lecun_uniform', use_bias=True)(x)  # Output: x, y
    concatenated_outputs = QActivation(quantized_relu(bits, 6))(output_1)
    for _ in range(num_objects - 1):
        output_tmp = QDense(2, kernel_quantizer=quantized_bits(bits, 0, alpha=1),
                            bias_quantizer=quantized_bits(bits, 0, alpha=1),
                            kernel_initializer='lecun_uniform', use_bias=True)(x)  # Output: x, y
        output = QActivation(quantized_relu(bits, 6))(output_tmp)
        concatenated_outputs = Concatenate(axis=-1)([concatenated_outputs, output])
    print(concatenated_outputs.shape)
    reshaped_outputs = Reshape((num_objects, 2))(concatenated_outputs)
    # Create the model
    model = Model(inputs=input1, outputs=reshaped_outputs)
    return model
```
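To double-check what actually feeds the Concatenate layer on the Keras side, a quick loop like the one below can be used (just a sketch; the module name matches the import in my hls4ml script, and `m` is a throwaway name here). For num_objects = 2 I would expect both inputs to be (None, 2) and the output (None, 4):

```python
from keras.layers import Concatenate
from qresnet_and_mlp import qresnetModelWithLocalization

m = qresnetModelWithLocalization(2)
for layer in m.layers:
    if isinstance(layer, Concatenate):
        # print the input shapes and the output shape of each Concatenate layer
        print(layer.name, [tuple(t.shape) for t in layer.input], '->', tuple(layer.output.shape))
```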
My hls4ml script:

```python
import hls4ml
import tensorflow as tf
import numpy as np
from qresnet_and_mlp import qresnetModelWithLocalization
import keras
import os

os.environ['PATH'] += os.pathsep + '/tools/Xilinx/Vitis_HLS/2023.1/bin'

model = qresnetModelWithLocalization(2)
model.summary()

config = hls4ml.utils.config_from_keras_model(model, granularity="name")

# Set the precision and reuse factor for the full model
config['Model']['Precision'] = 'ap_fixed<16,6>'
config['Model']['ReuseFactor'] = 4000
config['Model']['Strategy'] = 'Resource'
for layer in config['LayerName'].keys():
    config['LayerName'][layer]['Strategy'] = 'Resource'
    config['LayerName'][layer]['ReuseFactor'] = 4000

cfg = hls4ml.converters.create_config(backend='Vitis')
cfg['IOType'] = 'io_stream'  # Must set this if using CNNs!
cfg['HLSConfig'] = config
cfg['KerasModel'] = model
cfg['OutputDir'] = 'model_1/'
cfg['Part'] = 'xc7z030sbv485-3'
cfg['Interface'] = 'axi_stream'

hls_model = hls4ml.converters.keras_to_hls(cfg)
hls4ml.utils.plot_model(hls_model, show_shapes=True, show_precision=True, to_file=None)

hls_model.compile()

# Use Vitis HLS to synthesize the model
# This might take several minutes
hls_model.build(csim=False, synth=True)
```
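To map the 'layer80' variable named in the error below back to the original graph, the converted layers can be listed right after the conversion (get_layers() iterates over the hls4ml model graph; as far as I understand, the generated variable names are based on the layer index):

```python
# list the converted layers to relate HLS variable names such as 'layer80'
# to the corresponding Keras layers
for layer in hls_model.get_layers():
    print(layer.index, layer.name, type(layer).__name__)
```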
Error returned:

```
ERROR: [HLS 214-256] in function 'myproject(hls::stream<nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 3u>, 0>&, hls::stream<nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 2u>, 0>&)': Unsupported aggregate pragma/directive on variable 'layer80_cpy1' as the bit-width after aggregation (8192) is larger than 4096 (firmware/myproject.cpp:374:28)
ERROR: [HLS 214-256] in function 'myproject(hls::stream<nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 3u>, 0>&, hls::stream<nnet::array<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, 2u>, 0>&)': Unsupported aggregate pragma/directive on variable 'layer80_cpy2' as the bit-width after aggregation (8192) is larger than 4096 (firmware/myproject.cpp:376:25)
```
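If I'm reading the numbers right (just my back-of-the-envelope guess, not something I have confirmed in the generated firmware), the 8192 bits in the error would match the flattened feature map that gets cloned for the two QDense heads: the initial stride-2 convolution plus the three stride-2 blocks reduce the 64x64 input to 4x4, so with 32 filters the Flatten output carries 4 * 4 * 32 = 512 values of 16 bits each, i.e. 8192 bits, which is above the 4096-bit limit mentioned for the aggregate pragma.

```python
# back-of-the-envelope check of the aggregated stream width (my guess, not verified)
spatial = 64 // 2 // 2 // 2 // 2   # initial stride-2 conv + three stride-2 basic blocks -> 4
elements = spatial * spatial * 32  # flattened feature map with 32 filters -> 512
print(elements * 16)               # ap_fixed<16,6> elements -> 8192 bits, > 4096
```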
Steps to Reproduce
Add what needs to be done to reproduce the bug. Add commented code examples and make sure to include the original model files / code, and the commit hash you are working on.
- Use the latest hls4ml version (master branch) and Vitis HLS 2023.1
- Run the hls4ml script above with the model defined above
Expected behavior
The model synthesizes successfully.
Actual behavior
Pre-synthesis fails with the HLS 214-256 errors shown above.