Hey all, I am currently trying to fully unfold a small fully-connected network, very similar to the tfc-w2a2 example but with a reduced number of neurons. Building does not produce any errors, but the resulting stitched IP seems to be missing parts: according to the resource estimation the design should require 24721 LUTs, yet the OOC synthesis reports only 242 LUTs. The network has 16 neurons in each layer and I set the folding config as shown below.

EDIT: And this is the onnx for the network.
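(The original folding config attachment is not preserved here. The following is a minimal sketch of what a fully-unfolded config for 16-neuron layers might look like; the node names, the number of layers, and the first layer's SIMD are assumptions and depend on the actual graph.)

```python
# Sketch only -- NOT the original attachment. Node names assume the older
# StreamingFCLayer_Batch naming used around the tfc-w2a2 example era.
import json

folding_config = {
    "Defaults": {},
    # One entry per FC layer. Full unfolding means PE equals the number of
    # output neurons and SIMD equals the input fan-in (16 assumed here).
    "StreamingFCLayer_Batch_0": {"PE": 16, "SIMD": 16},
    "StreamingFCLayer_Batch_1": {"PE": 16, "SIMD": 16},
    "StreamingFCLayer_Batch_2": {"PE": 16, "SIMD": 16},
}

# Write the config in the JSON form that FINN's build flow consumes.
with open("folding_config.json", "w") as f:
    json.dump(folding_config, f, indent=2)
```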
Hi @Sekijoju -- can you clarify what you mean by "the resulting stitched IP has all its layers fully folded instead of unfolded"? Just to be sure: in FINN, the amount of parallelism inside each layer will not be visible at the IP block level. So even by going to max PE and SIMD, you will still see one IP block per layer, but internally each block will have a great deal of parallelism.
One extra recommendation I can give for fully unfolded FC layers is to use "ram_style" : "distributed" and "mem_mode" : "const" for all layers.
You may also get somewhat higher latency than expected, because FINN's HLS library of layers is optimized for some degree of folding rather than full unfolding. We've s…