From 0f0ec1887d13082225f997268098511c7faa291f Mon Sep 17 00:00:00 2001
From: Duchstf <minhduc8199@gmail.com>
Date: Wed, 12 May 2021 21:05:52 -0500
Subject: [PATCH 1/2] rename io_serial as io_stream and add some more info

---
 docs/api/configuration.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/api/configuration.rst b/docs/api/configuration.rst
index 4707b4a12..bf9525f16 100644
--- a/docs/api/configuration.rst
+++ b/docs/api/configuration.rst
@@ -68,7 +68,7 @@ It looks like this:
     XilinxPart: xcku115-flvb2104-2-i
     ClockPeriod: 5

-    IOType: io_parallel # options: io_serial/io_parallel
+    IOType: io_parallel # options: io_parallel/io_stream
     HLSConfig:
       Model:
         Precision: ap_fixed<16,6>
@@ -91,7 +91,7 @@ There are a number of configuration options that you have. Let's go through the
 * **XilinxPart**\ : the particular FPGA part number that you are considering, here it's a Xilinx Virtex-7 FPGA
 * **ClockPeriod**\ : the clock period, in ns, at which your algorithm runs
 Then you have some optimization parameters for how your algorithm runs:
-* **IOType**\ : your options are ``io_parallel`` or ``io_serial`` where this really defines if you are pipelining your algorithm or not
+* **IOType**\ : your options are ``io_parallel`` or ``io_stream``, where this really defines if you are pipelining your algorithm or not. ``io_stream`` is used for the CNN streaming architecture. For more information on streaming, see `this PR <https://github.com/fastmachinelearning/hls4ml/pull/220>`__.
 * **ReuseFactor**\ : in the case that you are pipelining, this defines the pipeline interval or initiation interval
 * **Strategy**\ : the optimization strategy on the FPGA, either "Latency" or "Resource". If none is supplied, hls4ml uses "Latency" by default. Note that a reuse factor larger than 1 should be specified when using the "Resource" strategy. An example of using a larger reuse factor can be found `here <https://github.com/hls-fpga-machine-learning/models/tree/master/keras/KERAS_dense>`__.
 * **Precision**\ : this defines the precision of your inputs, outputs, weights and biases. It is denoted by ``ap_fixed<X,Y>``\ , where ``Y`` is the number of bits representing the signed number above the binary point (i.e. the integer part), and ``X`` is the total number of bits.
--
GitLab

From f7704bb9bfc88e076d1032ca6dc27d3f67995a7b Mon Sep 17 00:00:00 2001
From: Javier Duarte <jduarte@ucsd.edu>
Date: Sat, 18 Jun 2022 15:03:42 -0700
Subject: [PATCH 2/2] Update configuration.rst

---
 docs/api/configuration.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/api/configuration.rst b/docs/api/configuration.rst
index bf9525f16..d2511950e 100644
--- a/docs/api/configuration.rst
+++ b/docs/api/configuration.rst
@@ -91,7 +91,7 @@ There are a number of configuration options that you have. Let's go through the
 * **XilinxPart**\ : the particular FPGA part number that you are considering, here it's a Xilinx Virtex-7 FPGA
 * **ClockPeriod**\ : the clock period, in ns, at which your algorithm runs
 Then you have some optimization parameters for how your algorithm runs:
-* **IOType**\ : your options are ``io_parallel`` or ``io_stream``, where this really defines if you are pipelining your algorithm or not. ``io_stream`` is used for the CNN streaming architecture. For more information on streaming, see `this PR <https://github.com/fastmachinelearning/hls4ml/pull/220>`__.
+* **IOType**\ : your options are ``io_parallel`` or ``io_stream``, which defines the type of data structure used for inputs, intermediate activations between layers, and outputs. For ``io_parallel``, arrays are used that, in principle, can be fully unrolled and are typically implemented in RAMs. For ``io_stream``, HLS streams are used, which are a more efficient/scalable mechanism to represent data that are produced and consumed in a sequential manner. Typically, HLS streams are implemented with FIFOs instead of RAMs. For more information, see `here <https://docs.xilinx.com/r/en-US/ug1399-vitis-hls/pragma-HLS-stream>`__.
 * **ReuseFactor**\ : in the case that you are pipelining, this defines the pipeline interval or initiation interval
 * **Strategy**\ : the optimization strategy on the FPGA, either "Latency" or "Resource". If none is supplied, hls4ml uses "Latency" by default. Note that a reuse factor larger than 1 should be specified when using the "Resource" strategy. An example of using a larger reuse factor can be found `here <https://github.com/hls-fpga-machine-learning/models/tree/master/keras/KERAS_dense>`__.
 * **Precision**\ : this defines the precision of your inputs, outputs, weights and biases. It is denoted by ``ap_fixed<X,Y>``\ , where ``Y`` is the number of bits representing the signed number above the binary point (i.e. the integer part), and ``X`` is the total number of bits.
--
GitLab
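For orientation, here is how the options touched by these two patches fit together in one configuration file. This is a minimal sketch, assuming ``ReuseFactor`` and ``Strategy`` sit under ``HLSConfig: Model:`` alongside ``Precision`` as in the snippet shown in the first hunk; all values are purely illustrative:

    XilinxPart: xcku115-flvb2104-2-i   # target FPGA part number
    ClockPeriod: 5                     # clock period in ns

    IOType: io_stream                  # options: io_parallel/io_stream
    HLSConfig:
      Model:
        Precision: ap_fixed<16,6>      # 16 bits total, of which 6 are integer bits
        ReuseFactor: 4                 # initiation interval; keep >1 with "Resource"
        Strategy: Resource             # "Latency" (default) or "Resource"

With ``ap_fixed<16,6>``, values are signed fixed-point numbers with 6 bits above the binary point and the remaining 10 bits below it.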