
Residual structure cannot be converted to hls #67

Open
TATynise opened this issue Oct 13, 2023 · 1 comment

@TATynise

Hi, I encountered "cycle-free graph violated: partition depends on itself" while running a custom network through FINN. I have tried adjusting the streamlining and convert_to_hls steps according to the ResNet-50 finn-example, but it still fails.

This is the residual part of the network:

[image]

Refer to "cnv_end2end_example",after streamline the residual part is as shown in the figure:

[image]

After applying the "streamline nonlinear" step from the ResNet-50 finn-example, the graph looks like this:

[image]

Then, after converting to HLS layers, the graph looks like this:

[image]
When I finally call parent_model = model.transform(CreateDataflowPartition()), it fails because the residual part was not converted successfully. I have tried many approaches, but nothing works; I hope you can provide some guidance.
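
For reference, the flow I am running looks roughly like this (a simplified sketch based on the ResNet-50 finn-example; the exact transform names and import paths depend on the FINN version, and the filename is a placeholder):

```python
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.transformation.general import GiveUniqueNodeNames
from finn.transformation.streamline import Streamline
import finn.transformation.fpgadataflow.convert_to_hls_layers as to_hls
from finn.transformation.fpgadataflow.create_dataflow_partition import (
    CreateDataflowPartition,
)

# placeholder filename for the tidied Brevitas export of the custom network
model = ModelWrapper("custom_net_tidy.onnx")

# streamlining: move/absorb floating-point Mul/Add nodes into MultiThresholds
model = model.transform(Streamline())

# convert streamlined layers to their fpgadataflow (HLS) equivalents;
# the exact set of Infer* transforms depends on the network
model = model.transform(to_hls.InferConvInpGen())
model = model.transform(to_hls.InferThresholdingLayer())
model = model.transform(to_hls.InferAddStreamsLayer())
model = model.transform(to_hls.InferDuplicateStreamsLayer())
model = model.transform(GiveUniqueNodeNames())

# this is the step that fails with
# "cycle-free graph violated: partition depends on itself"
parent_model = model.transform(CreateDataflowPartition())
```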

Thanks.

@mmrahorovic
Collaborator

Hi @TATynise,

Thanks for your question!

Residual networks are indeed a bit tricky, since they require a streamlining process that is more involved than for linear networks. It looks like the streamlining didn't fully streamline the graph, meaning you still have a few floating-point operators left in your network. In the final image you showed, you can see that the Mul and Add nodes (which are regular ONNX nodes) are mixed with the so-called fpgadataflow nodes (FMPadding_Batch, ConvolutionInputGenerator). The CreateDataflowPartition transform partitions your model into smaller sub-models, where each sub-model consists exclusively of nodes that are either standard ONNX nodes or fpgadataflow-type nodes (i.e. nodes that will in the end run on the FPGA). Since your network is residual and contains many of these regular ONNX nodes mixed with fpgadataflow-type nodes, the partitioning becomes more complicated and breaks somewhere along the way.
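
As a quick way to see what is blocking the partitioning, you can list which nodes are still regular ONNX nodes after your convert-to-HLS step. A small sketch (the filename is a placeholder, and the domain string is the one current FINN versions use for HLS nodes; adjust both to your setup):

```python
from qonnx.core.modelwrapper import ModelWrapper

# placeholder filename: the model right after your convert_to_hls transforms
model = ModelWrapper("after_convert_to_hls.onnx")

# fpgadataflow nodes carry this domain; everything else is still a plain
# ONNX node and will be grouped into a non-FPGA partition
FPGADATAFLOW_DOMAIN = "finn.custom_op.fpgadataflow"

leftover = [
    (node.name, node.op_type)
    for node in model.graph.node
    if node.domain != FPGADATAFLOW_DOMAIN
]
print("Nodes not converted to fpgadataflow:", leftover)
```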

To resolve this, I would first suggest revisiting the streamlining of your network, since I presume your target is to run the full network on the FPGA rather than only part of it. One trick that makes this easier is to add uniform quantizers at the end of both residual lanes in your custom network, before exporting it with Brevitas (a sketch of such a block is included at the end of this reply). In the third image you showed, this would result in a MultiThreshold node at the end of both lanes. These MultiThreshold nodes are essentially what allows us to streamline away floating-point operators by moving them around and absorbing them into the MultiThreshold thresholds. By then calling transforms such as AbsorbAddIntoMultiThreshold and AbsorbMulIntoMultiThreshold, those floating-point operators will be absorbed into the thresholds of the subsequent MultiThreshold node.
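
In code, that absorption step would look roughly like this. This is only a sketch: MoveLinearPastEltwiseAdd is one of the reorder transforms typically needed to first push scalar Mul/Add nodes past the elementwise Add of the residual branch, and the exact sequence (and how often you repeat it) depends on your graph:

```python
from finn.transformation.streamline.absorb import (
    AbsorbAddIntoMultiThreshold,
    AbsorbMulIntoMultiThreshold,
)
from finn.transformation.streamline.reorder import MoveLinearPastEltwiseAdd
from qonnx.transformation.infer_datatypes import InferDataTypes

# move the scalar Mul/Add past the elementwise Add of the residual branch,
# so that they end up directly in front of a MultiThreshold node ...
model = model.transform(MoveLinearPastEltwiseAdd())
# ... and then absorb them into that MultiThreshold's thresholds
model = model.transform(AbsorbAddIntoMultiThreshold())
model = model.transform(AbsorbMulIntoMultiThreshold())
model = model.transform(InferDataTypes())
```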

This would remove the floating-point operators you showed in the screenshots and bring you one step closer to full FPGA execution. A sketch of the Brevitas-side quantizer trick is included below as well. Hope this helps, and please let us know if you run into further issues!
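
For reference, a residual block with a uniform quantizer at the end of both lanes could look like the following in Brevitas. This is a sketch only; the layer names, channel count and bit width are placeholders rather than your actual model:

```python
import torch.nn as nn
from brevitas.nn import QuantConv2d, QuantIdentity, QuantReLU

class ResidualBlock(nn.Module):
    def __init__(self, channels, bit_width=4):
        super().__init__()
        self.conv = QuantConv2d(
            channels, channels, kernel_size=3, padding=1,
            weight_bit_width=bit_width,
        )
        self.relu = QuantReLU(bit_width=bit_width)
        # the same uniform quantizer on BOTH residual lanes: after export this
        # shows up as a MultiThreshold at the end of each lane, which gives the
        # streamlining transforms something to absorb the Mul/Add nodes into
        self.lane_quant = QuantIdentity(
            bit_width=bit_width, return_quant_tensor=True
        )

    def forward(self, x):
        branch = self.relu(self.conv(x))
        return self.lane_quant(branch) + self.lane_quant(x)
```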
