FINN v0.10 released! #1026
FINN release v0.10
FINN v0.10 is finally here!
Over the last year, we have invested a lot of time refactoring FINN and implementing exciting new features.
As already indicated in the last release, we have continued to work on operator hardening (RTL variants of important HLS layers) and, in the process, made the custom layer integration more flexible. This is the most disruptive change in this release, and you can find more details in the section on Refactoring of Custom Operator Infrastructure.
In addition, we have updated the contribution guidelines. The Docker setup was updated to Ubuntu 22.04 and Python 3.10, and we recommend using FINN with Vivado/Vitis 2022.2. You can read more about this in this blog post.
But now let's talk about the improvements and highlights in more detail:
Refactoring of Custom Operator Infrastructure
The FINN compiler was developed with the assumption that the hardware blocks corresponding to the neural network layers are developed based on HLS. While we do not intend to abandon the HLS implementations, it has become apparent over the years that certain modules are better implemented in RTL. This gives us greater control over the resulting hardware and lets us make optimal use of FPGA resources.
As more and more RTL variants of common FINN hardware building blocks were added, we decided to refactor the custom operator class structure and modify the builder steps.
New Class Hierarchy
Previously, fpgadataflow nodes were derived from the HLSCustomOp class, which in turn was derived from the CustomOp class coming from the qonnx toolkit. We have split the HLSCustomOp class into three classes:
- HWCustomOp: the hardware abstraction base class from which all fpgadataflow nodes derive
- HLSBackend: a mixin providing the HLS-specific functionality
- RTLBackend: a mixin providing the RTL-specific functionality
Every fpgadataflow node now has up to three representations. Let’s have a look at an example:
The FMPadding node is used to implement padding in a convolution. With the new structure, there are three Python classes related to FMPadding:
- FMPadding: the hardware abstraction layer node
- FMPadding_hls: the HLS variant
- FMPadding_rtl: the RTL variant
Here is a class diagram of the new fpgadataflow custom op class hierarchy:
Please note that not all layers have both an HLS and an RTL variant; some have only one.
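To illustrate, here is a small sketch of how the three FMPadding representations relate to each other in code. The module paths and class names below reflect our understanding of the new `finn.custom_op.fpgadataflow` layout and may need to be adapted to your FINN version.

```python
# Illustrative only: module paths/class names assume the FINN v0.10 layout
# of finn.custom_op.fpgadataflow and its hls/rtl subpackages.
from finn.custom_op.fpgadataflow.fmpadding import FMPadding              # HW abstraction node
from finn.custom_op.fpgadataflow.hls.fmpadding_hls import FMPadding_hls  # HLS specialization
from finn.custom_op.fpgadataflow.rtl.fmpadding_rtl import FMPadding_rtl  # RTL specialization

# The specialized variants inherit from the HW abstraction class plus a
# backend mixin, which is visible in their method resolution order.
print([c.__name__ for c in FMPadding_hls.__mro__])
print([c.__name__ for c in FMPadding_rtl.__mro__])
```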
Updated FINN Flow
Since the new class hierarchy introduces an additional layer for expressing the model (HW abstraction nodes), the previous `step_convert_to_hls` builder step was replaced by `step_convert_to_hw`, which converts standard ONNX layers to HW abstraction layers. We then introduced an additional builder step called `step_specialize_layers`. In this step, HW nodes are specialized to either an HLS or an RTL variant, either based on pre-determined rules or according to a user-provided configuration file containing the desired settings. If the user preference cannot be fulfilled, a warning is printed and the implementation style falls back to the default.
You can learn more about how to use this step in the 4_advanced_builder_settings notebook. Thanks to @jmonks-amd, we have a guide on how to convert your current FINN flow to the new builder flow: #1020
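As a quick orientation, here is a minimal, hedged sketch of a builder configuration that relies on the default v0.10 build steps (which include `step_convert_to_hw` and `step_specialize_layers`). The field names follow the FINN builder API as we understand it, and the file names are placeholders.

```python
# Minimal sketch, not a verified end-to-end script: file names are placeholders
# and the specialize_layers_config_file field should be checked against your FINN version.
import finn.builder.build_dataflow as build
import finn.builder.build_dataflow_config as build_cfg

cfg = build_cfg.DataflowBuildConfig(
    output_dir="build_output",
    synth_clk_period_ns=5.0,
    fpga_part="xc7z020clg400-1",
    # optional JSON file with per-node HLS/RTL preferences; if a preference
    # cannot be fulfilled, FINN warns and falls back to the default style
    specialize_layers_config_file="specialize_layers_config.json",
    generate_outputs=[build_cfg.DataflowOutputType.ESTIMATE_REPORTS],
)
build.build_dataflow_cfg("model.onnx", cfg)
```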
New RTL Components
We are excited to announce that after providing RTL variants for the ConvolutionInputGenerator and the FMPadding component (thanks to @fpjentzsch and @maltanar), we now also offer optimized RTL implementations for the key layers: Thresholding and MatrixVectorActivation/VectorVectorActivation.
If you would like to find out more, please have a look at the dedicated Show & Tell posts about these components.
Thanks to @preusser, @azizb-xlnx, @mmrahorovic and @fionnodonohoe-xlnx for your great contributions on these features.
Accumulator Width and Weight Bit Width Minimization
The FINN building blocks have long been capable of automatically reducing the accumulator bit width for individual layers. With this release, we have improved FINN’s automated accumulator bit width reduction methods. We have also added a new method to automatically reduce the weight bit width of a layer based on known weight values (assuming the weights are not runtime-writeable). We have packaged both transformations into a new dataflow step:
`step_minimize_bit_width`, which can be easily inserted into any FINN flow! Thanks a lot to @i-colbert for your contributions on this! This work was done in the context of our research on accumulator-aware quantization (A2Q), a new quantization-aware training technique that enables users to train models for a target accumulator bit width during inference (Colbert et al., 2023; "A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance").
This technique is fully integrated into the Brevitas framework; for an example of how to train a model for a specified accumulator bit width, please see this example in Brevitas. We are also working on an end-to-end example of how to use A2Q with FINN, so please stay tuned!
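For reference, here is a rough sketch of what `step_minimize_bit_width` does under the hood. The transformation module paths are our assumption of the current FINN layout; in a normal flow you would simply include the builder step rather than applying the transformations manually.

```python
# Sketch under assumptions: MinimizeWeightBitWidth / MinimizeAccumulatorWidth
# are assumed to live at these paths; verify against your FINN version.
from qonnx.core.modelwrapper import ModelWrapper
from finn.transformation.fpgadataflow.minimize_accumulator_width import (
    MinimizeAccumulatorWidth,
)
from finn.transformation.fpgadataflow.minimize_weight_bit_width import (
    MinimizeWeightBitWidth,
)

model = ModelWrapper("model_with_hw_layers.onnx")    # model after HW conversion
model = model.transform(MinimizeWeightBitWidth())    # shrink weight bit widths first
model = model.transform(MinimizeAccumulatorWidth())  # then shrink accumulator widths
model.save("model_minimized.onnx")
```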
Other Improvements and New Features
In addition to these highlights, we have invested in other improvements and new features; please find a list below. We thank all contributors; the GitHub account names in parentheses after each contribution identify external contributors.
- Tutorial about QONNX export and QONNX -> FINN-ONNX conversion (@heborras)
- Tutorial about Folding Factors (@shashwat1198)
- Tutorial about Advanced Builder Settings
This has been a major release, and we would like to thank all contributors for their amazing work!
Have fun trying out the new flow and features!
The FINN Team