Prepare the SB_MAC16 DSP component for the FOMU #611

Open
wants to merge 1 commit into
base: master

gcm-explo1t

For a challenge at CSCG and ALLESCTF I used the SB_MAC16 DSP. Maybe it will also be useful for somebody else in the future.

Collaborator

@umarcor umarcor left a comment

Overall, looks good. However, I am concerned with the types of the generics. The ones used by Lattice are not always intuitive, and there are some differences between Radiant and Yosys. For instance:

generic ( 
  NEG_TRIGGER              : bit :='0'; 
  C_REG                    : bit :='0'; -- C0
  A_REG                    : bit :='0'; -- C1 
  B_REG                    : bit :='0'; -- C2
  D_REG                    : bit :='0'; -- C3
  TOP_8x8_MULT_REG         : bit :='0'; -- C4
  BOT_8x8_MULT_REG         : bit :='0'; -- C5
  PIPELINE_16x16_MULT_REG1 : bit :='0'; -- C6
  PIPELINE_16x16_MULT_REG2 : bit :='0'; -- C7
  TOPOUTPUT_SELECT         : bit_vector(1 downto 0) := "00" ;  -- COMB, ACCUM_REG, MULT_8x8, MULT_16x16  // {C9,C8}   = 00, 01, 10, 11
  TOPADDSUB_LOWERINPUT     : bit_vector(1 downto 0) := "00" ;  -- DATA, MULT_8x8, MULT_16x16, SIGNEXT    // {C11,C10} = 00, 01, 10, 11
  TOPADDSUB_UPPERINPUT     : bit                    := '0'  ;  -- ACCUM_REG, DATAC                       // C12 = 0, 1
  TOPADDSUB_CARRYSELECT    : bit_vector(1 downto 0) := "00" ;  -- LOGIC0,LOGIC1,ACCUMCI,GENERATED_CARRY  // {C14, C13} = 00, 01, 10, 11
  BOTOUTPUT_SELECT         : bit_vector(1 downto 0) := "00" ;  -- COMB, ACCUM_REG, MULT_8x8, MULT_16x16  // {C16,C15}  = 00, 01, 10, 11
  BOTADDSUB_LOWERINPUT     : bit_vector(1 downto 0) := "00" ;  -- DATA, MULT_8x8, MULT_16x16, SIGNEXTIN  // {C18,C17}  = 00, 01, 10, 11
  BOTADDSUB_UPPERINPUT     : bit                    := '0'  ;  -- ACCUM_REG, DATAD                       //  C19 = 0, 1
  BOTADDSUB_CARRYSELECT    : bit_vector(1 downto 0) := "00" ;  -- LOGIC0, LOGIC1, ACCUMCI, CI            //  {C21, C20}=00,01,10,11
  MODE_8x8                 : bit                    := '0'  ;  -- C22 
  A_SIGNED                 : bit := '0' ;  -- C23
  B_SIGNED                 : bit := '0'    -- C24
);

Therefore, I wonder why you used integer instead of bit and bit_vector. Do ghdl-yosys-plugin and Yosys expect integers? Did you use this component declaration with the toolchains used in the CI of this repo?

IMHO, it would be desirable to enhance this PR with a minimal VHDL design that uses the DSP. It does not need to be an elaborate or useful application, just something that lets us run the tools and check that the DSP is properly instantiated. For instance, extend the "blink" example so that the duty cycle of the PWMs is multiplied by a varying signal K. The "variation" might be just periodically switching between two values: a multiplexer with two constants, with a low-frequency square signal driving the select.
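
A minimal sketch of the multiplexer-plus-multiplier idea described above. The entity name, port names, widths, and the two constants are all hypothetical, and the multiply is written behaviourally: whether the tools map it onto an SB_MAC16 depends on the synthesis options, and an explicit instantiation would need the component declaration from this PR.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: names, widths and constants are illustrative only.
entity pwm_scale is
  port (
    clk    : in  std_logic;
    duty   : in  unsigned(7 downto 0);  -- PWM duty cycle
    scaled : out unsigned(7 downto 0)   -- duty cycle scaled by K
  );
end entity pwm_scale;

architecture rtl of pwm_scale is
  constant K_LOW  : unsigned(7 downto 0) := to_unsigned(64, 8);   -- assumed value
  constant K_HIGH : unsigned(7 downto 0) := to_unsigned(255, 8);  -- assumed value

  signal slow_cnt : unsigned(23 downto 0) := (others => '0');
  signal k        : unsigned(7 downto 0);
  signal product  : unsigned(15 downto 0);
begin
  -- Free-running counter; its MSB is the low-frequency square signal.
  process (clk)
  begin
    if rising_edge(clk) then
      slow_cnt <= slow_cnt + 1;
    end if;
  end process;

  -- Multiplexer: the square signal selects one of the two constants.
  k <= K_HIGH when slow_cnt(slow_cnt'high) = '1' else K_LOW;

  -- 8x8 unsigned multiply; the upper byte is the scaled duty cycle.
  product <= duty * k;
  scaled  <= product(15 downto 8);
end architecture rtl;
```

The `scaled` output would then feed the PWM comparator in place of the raw duty cycle.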

@gcm-explo1t
Author

Thanks for the feedback. I used integer because this was the only way to get it to compile with the current toolchain (the provided ghdl-yosys-plugin, Yosys, etc.). I have a blink example that uses the DSP, but in a very unintuitive way, as it was a reversing challenge for a CTF. I have attached the challenge file.
challenge.zip
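
For readers following the type discussion, here is a hypothetical excerpt of what a component declaration with integer generics can look like. It is not the declaration from this PR: the package name is made up, and only a few of the generics and ports are shown. The generic names and encodings follow the Lattice listing quoted in the review.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package; abbreviated, not the declaration from this PR.
package sb_mac16_sketch is
  component SB_MAC16 is
    generic (
      A_REG            : integer := 0;  -- 0/1 in place of bit '0'/'1'
      B_REG            : integer := 0;
      TOPOUTPUT_SELECT : integer := 0;  -- 0..3 in place of "00".."11"
      BOTOUTPUT_SELECT : integer := 0;  -- 3 corresponds to "11" (MULT_16x16)
      A_SIGNED         : integer := 0;
      B_SIGNED         : integer := 0
    );
    port (
      CLK : in  std_logic;
      CE  : in  std_logic;
      A   : in  std_logic_vector(15 downto 0);
      B   : in  std_logic_vector(15 downto 0);
      O   : out std_logic_vector(31 downto 0)
    );
  end component SB_MAC16;
end package sb_mac16_sketch;
```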

@umarcor
Collaborator

umarcor commented Jan 23, 2022

I find your use case pretty exciting! While I check the design... would you mind providing some more context?

  • Is the challenge public? It'd be nice to have a URL for the context where Fomu and the DSP were used. The main purpose of this repository is learning/explaining, so references of that kind are interesting.
  • Why did you use FOMU? At first, I thought you'd be using it as a RISC-V CPU with some accelerator/co-processor, but it seems the challenge is based on hardware/RTL only.
  • Did you set up the challenge? I see the README is signed by you.

Overall, the concept of having a FOMU as a service, with a queue where users can submit bitstreams to have them tested, is amazing. I know that @mithro has been wanting to provide a similar solution for other boards. Therefore, all the information you can provide is really helpful.

Naturally, this is mostly curiosity. It's ok if you want to stick to the code contribution.

@gcm-explo1t
Author

Well, for context, I'm part of NFITS, which organizes a competition for young IT security talents in Germany. I'm also part of a CTF team called ALLES!, which organizes a yearly competition as well. For both of these events I wrote some FPGA challenges, as this topic is rather rare in the IT security domain. I was already in contact with @mithro, who enabled me to give the CSCG finalists real FOMUs and let them hack on real hardware, which was awesome. However, the ALLES! event had far too many participants for that, so I had to create a queue system to deploy their solutions on real hardware.

> Is the challenge public? It'd be nice to have a URL for the context where Fomu and the DSP were used. The main purpose of this repository is learning/explaining, so references of that kind are interesting.

That, in summary, is the setting of the challenge. The challenge is partly public: we used it in the events, and some writeups are available online. We will soon release all the challenges as fully open source; I could then link this challenge in the fomu-workshop repository.

> Why did you use FOMU? At first, I thought you'd be using it as a RISC-V CPU with some accelerator/co-processor, but it seems the challenge is based on hardware/RTL only.

Well, FOMUs are among the cheapest real FPGAs, and I already knew the board from the c3 event, where @mithro handed me one. I wanted to do some FPGA/VHDL stuff because very little tooling is available for it, so people have to teach themselves this new topic and really dig into it to get a grip on it, rather than reaching for some generic solver tool.

> Did you set up the challenge? I see the README is signed by you.

Yes, a colleague and I are the authors of the challenges.

> Overall, the concept of having a FOMU as a service, with a queue where users can submit bitstreams to have them tested, is amazing. I know that @mithro has been wanting to provide a similar solution for other boards. Therefore, all the information you can provide is really helpful.

Well, this queue system was not for bitstreams but for button inputs on the FOMU, as you can see in the exmple.txt file in the ZIP attached above. However, it would not be difficult to change it into a full bitstream queue, if that would be interesting.

@gcm-explo1t
Author

gcm-explo1t commented Jan 23, 2022

To add to this: in the remote setup, the FOMU buttons were hooked up to some GPIOs of a Raspberry Pi, and the Pi had a camera recording the LED of the FOMU. To submit a solution for the challenge, you sent a txt file to the queue server, which handed the task to one of the available Pis. The Pi then flashed the bitstream onto the FOMU and ran the txt input, which enabled or disabled the GPIOs hooked up to the FOMU buttons, sleeping in between button toggles. When the input was over, the Pi sent the video to the scheduling server, and the server sent it to the participating team.
