Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Matrix Multiplication Improvements #1551

Merged
merged 20 commits into from
Jun 24, 2024
Merged

Matrix Multiplication Improvements #1551

merged 20 commits into from
Jun 24, 2024

Conversation

andrej
Copy link
Collaborator

@andrej andrej commented Jun 11, 2024

  • work around object fifo bug ObjectFifo - Assertion Error in Nested Loop with Only One Iteration #1547
  • in host verification code, accumulate in float instead of bf16 -- this allows verification to pass even for bigger problem sizes, where before the accumulation errors got too big
  • if problem size is beyond threshold, stochastically verify instead of exhaustively (exhaustively verifying takes way too long, which is probably why the failed verification for larger problem sizes went undetected before) -- if this is merged, we can close Stochastic matmul verify #1134 which contains a similar change
  • add assertions and comments
  • improve verification output
  • add workaround comment for excessive program size (CDO generation error [AIE ERROR] _XAie_LoadProgMemSection():230: Overflow of program memory e.g. for size M=768 K=768 N=256 if fifo_depth>1)

cc @jgmelber

@andrej
Copy link
Collaborator Author

andrej commented Jun 12, 2024

  • make use of changes in Changes to NpuDmaMemcpyNdOp and AIEDmaToNpu to support sub-word strides, offsets and sizes #1538 to simplify code (strides) for whole_array design
  • allow changing inner tile size (small m, k, n) for whole_array design
  • replace complicated verficiation code with the naive implementation; the idea was that the other implementation would be faster, but it was not (significantly); swapped the naive version back in to have easier maintainability and confidence in the code

@andrej
Copy link
Collaborator Author

andrej commented Jun 13, 2024

@jgmelber Kindly kick off the tests for this when you get a chance. I fixed a number of issues with the matmuls that slid by because the verification tolerance was a bit high. I reduced the tolerance, so let's see if the tests reveal more issues.

@jgmelber jgmelber linked an issue Jun 13, 2024 that may be closed by this pull request
@andrej
Copy link
Collaborator Author

andrej commented Jun 13, 2024

@jgmelber Could you please kick off the CI again? TIA

@jgmelber
Copy link
Collaborator

@andrej matrix_vector still needs to be fixed, the rest look okay

@andrej
Copy link
Collaborator Author

andrej commented Jun 14, 2024

@jgmelber Hopefully this will pass now

@andrej
Copy link
Collaborator Author

andrej commented Jun 23, 2024

@jgmelber Could you kick off the tests on this please?

@andrej
Copy link
Collaborator Author

andrej commented Jun 24, 2024

Looks like tests pass but I forgot to format. Another try please @jgmelber ?

@jgmelber jgmelber added this pull request to the merge queue Jun 24, 2024
Merged via the queue into Xilinx:main with commit 7917990 Jun 24, 2024
51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Matmul examples don't satisfy numerics
3 participants