ReLU merge optimizer pass
Created by: oliviaweng
Description
We introduce an hls4ml optimizer pass that merges a ReLU layer into the Dense/Conv2D layer it immediately follows---a frequently encountered pattern in neural networks (NNs). hls4ml lays NNs out spatially, implementing each layer as its own dataflow stage; stages are linked together by FIFOs, which can cost BRAMs, LUTs, and/or FFs. By default in hls4ml, each ReLU is implemented as its own dataflow stage. Because each additional dataflow stage costs extra logic and FIFOs, we reduce resource utilization by merging the ReLU activation function into the layer preceding it. Although a layer with the newly merged ReLU functionality uses more logic than before, removing the standalone ReLU stage and its FIFOs still yields a net decrease in resources. This optimization was introduced in hls4ml's MLPerf TinyML Benchmark 2022 submission and written up in this paper, which also reports the resource reductions it achieves.
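For context, the sketch below shows roughly how such a pass could be expressed against hls4ml's optimizer-pass interface (a `match`/`transform` pair applied to nodes of the model graph). It is illustrative only: the class name `MergeReluIntoDense` and the `merged_relu` attribute are made up for this example and are not taken from the implementation in this PR, and the exact node API may differ between hls4ml versions.

```python
# Hypothetical sketch of a ReLU-merge pass, assuming hls4ml's OptimizerPass
# interface (match/transform). Names marked below are illustrative only.
from hls4ml.model.optimizer import OptimizerPass


class MergeReluIntoDense(OptimizerPass):  # illustrative class name
    """Fold a standalone ReLU activation node into the Dense/Conv2D node before it."""

    def match(self, node):
        # Fire only on ReLU activation nodes whose producer is Dense or Conv2D.
        is_relu = (
            node.__class__.__name__ == 'Activation'
            and node.get_attr('activation') == 'relu'
        )
        if not is_relu:
            return False
        prev = node.get_input_node()
        return prev is not None and prev.__class__.__name__ in ('Dense', 'Conv2D')

    def transform(self, model, node):
        # Tag the preceding layer so its HLS kernel applies ReLU on its outputs,
        # then drop the standalone ReLU node (and with it, its dataflow stage
        # and FIFOs) from the graph.
        prev = node.get_input_node()
        prev.set_attr('merged_relu', True)  # illustrative attribute name
        model.remove_node(node, rewire=True)
        return True  # the model graph changed
```

In hls4ml, a pass like this would typically be made available by registering it (e.g. via `register_pass`) so it runs during model conversion; the key effect is that the separate ReLU dataflow stage and its FIFOs no longer appear in the generated HLS.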
Type of change
This optimization pass was first mentioned in the MLPerf TinyML PR #503.
- New feature (non-breaking change which adds functionality)
- A new research paper code implementation
Tests
This repo contains two test models (a fully-connected NN and a CNN) that can be trained on MNIST, converted to Vivado HLS projects, and synthesized with Vivado HLS 2020.1; a sketch of that flow follows.
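As a rough illustration of that flow, the following sketch trains a small Keras model on MNIST, converts it with hls4ml, and launches synthesis. The model architecture, config granularity, output directory, and FPGA part are placeholders chosen for this example, not values taken from the repo.

```python
# Illustrative train -> convert -> synthesize flow; all specific choices here
# (layer sizes, part number, output_dir) are placeholders, not from this repo.
import hls4ml
from tensorflow import keras

# Small fully-connected model on MNIST (placeholder architecture).
(x_train, y_train), _ = keras.datasets.mnist.load_data()
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(x_train / 255.0, y_train, epochs=1)

# Convert to an HLS project with hls4ml and run synthesis.
config = hls4ml.utils.config_from_keras_model(model, granularity='model')
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='hls4ml_prj',              # placeholder output directory
    part='xcu250-figd2104-2L-e',          # example Xilinx part, not from the repo
)
hls_model.compile()          # C simulation build
hls_model.build(csim=False)  # run Vivado HLS synthesis
```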
Checklist
- I have read the guidelines for contributing.
- I have commented my code, particularly in hard-to-understand areas.
- I have made corresponding changes to the documentation.
- My changes generate no new warnings.
- I have added tests that prove my fix is effective or that my feature works.