Add support for HGQ proxy model
Created by: calad0i
Description
This PR adds support for a special layer type, `FixedPointQuantizer`, used in calad0i/HGQ/hls4ml-integration to facilitate conversion from an HGQ model to an hls4ml model graph. A `ModelGraph` converted from a proxy model is meant to be bit-accurate, unless `fp32` (or `tf32`) cannot emulate the required precision during TensorFlow model inference.
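Roughly, the intended flow is to export the trained HGQ model as a proxy model and hand that to hls4ml. This is a sketch only: the `to_proxy_model` import path and call signature are assumptions based on calad0i/HGQ/hls4ml-integration and may differ in the actual API.

```python
# Hypothetical sketch; the helper name and import path are assumed.
from HGQ.proxy import to_proxy_model

# The proxy model is a plain Keras model in which FixedPointQuantizer
# layers carry all precision settings of the trained HGQ model.
proxy_model = to_proxy_model(trained_hgq_model)
```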
The layer has three major functions:
- When heterogeneous activation quantization is used, it performs value masking (with code generation). Otherwise, the layer is removed.
- It overrides the precision settings of other layers; in a proxy model, all precision settings are embedded in these layers.
- Though not used within hls4ml, when `fp32` can represent the required precision, the proxy model can emulate the HLS model's output, with or without overflow, using `SAT` or `WRAP` (other overflow modes are untested); a numpy sketch of this masking/emulation follows this list.
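For intuition, here is a minimal numpy sketch of the kind of fixed-point value masking such a quantizer performs. The function name `fixed_point_quantize`, its parameter names, and the bit-allocation convention (here `b` and `i` both count the sign bit) are illustrative assumptions, not the HGQ implementation; the real layer also stores these parameters per channel for heterogeneous quantization, while they are scalars here for brevity.

```python
import numpy as np

def fixed_point_quantize(x, k=1, b=8, i=4, overflow="WRAP"):
    """Emulate a fixed-point cast in floating point (hypothetical helper).

    Assumed convention: k is the sign flag, b the total bit width
    including the sign bit, i the integer bits including the sign bit.
    """
    f = b - i                                   # fractional bits
    y = np.floor(x * 2.0**f + 0.5)              # round-to-nearest mantissa
    lo, hi = (-(2 ** (b - 1)), 2 ** (b - 1) - 1) if k else (0, 2**b - 1)
    if overflow == "SAT":
        y = np.clip(y, lo, hi)                  # saturate at the range limits
    else:                                       # WRAP: two's-complement wraparound
        y = (y - lo) % 2.0**b + lo
    return y * 2.0**-f

# e.g. fixed_point_quantize(np.array([0.3, -9.7, 123.4])) masks values to
# those representable as signed 8-bit fixed point with 4 fractional bits.
```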
In the future, the proxy model mechanism could also be used with QKeras models, but this part is not yet implemented.
#912 can still break bit-accuracy for pooling layers with `io_stream`. A temporary fix is available in #917; it will be superseded by #855 when that becomes available.
Independently of this PR, `Conv2D` in Quartus with a 3x3 filter size and `io_parallel` appears to be broken at the moment, so some tests are expected to fail. The cause seems to be related to the Winograd implementation.
This PR depends on #887, #906, #907, #908, #909, and #911.
Type of change
- New feature (non-breaking change which adds functionality)
Tests
`test/pytest/test_proxy_model.py`
Test Configuration:
Requires the models added in fastmachinelearning/example-models#11.
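As a usage sketch, the bit-accuracy property can be checked along these lines. This is a minimal outline assuming the `proxy_model` from the earlier sketch; the output directory and random test data are placeholders, and the actual test file may be structured differently.

```python
import numpy as np
import hls4ml

# Convert the proxy model; per the description above, precision settings
# come from the embedded FixedPointQuantizer layers.
hls_model = hls4ml.converters.convert_from_keras_model(
    proxy_model, output_dir='hls4ml_prj', io_type='io_parallel'
)
hls_model.compile()

x = np.random.rand(100, *proxy_model.input_shape[1:]).astype(np.float32)
# Bit accuracy: outputs should match exactly as long as fp32 can
# represent every intermediate value.
np.testing.assert_array_equal(
    proxy_model.predict(x).ravel(), hls_model.predict(x).ravel()
)
```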
Checklist
- I have read the guidelines for contributing.
- I have commented my code, particularly in hard-to-understand areas.
- I have made corresponding changes to the documentation.
- My changes generate no new warnings.
- I have installed and run `pre-commit` on the files I edited or added.
- I have added tests that prove my fix is effective or that my feature works.