[GPU] Disable unaligned to intrinsic batch matmul codegen with vector distribute #18935

Open
nirvedhmeshram wants to merge 2 commits into main from disable_unaligned_bmm_vectordistribute
@nirvedhmeshram nirvedhmeshram commented Oct 29, 2024

This path doesn't support all batch matmul shapes, but it attempts them anyway and fails, e.g. #18601.

By default we should favor broader functionality support over performance, so this PR keeps this path behind a flag that is off by default.

Fixes: #18601

If we bail out here, we go down the SIMT path for now (note that we already do so for non-batch-matmul GEMMs with such shapes), with Tile and Fuse pipeline support planned for the future. In the meantime, models whose shapes are supported by this path can opt in using the provided flag, and tuners can always use this pipeline if it works for the shape. We can also turn this on by default once we can add correct heuristics for when it is okay to use this path.
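
The selection behavior described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: every name here (`Pipeline`, `BatchMatmulShape`, `IntrinsicShape`, `isAlignedToIntrinsic`, `selectPipeline`, and the `enableUnalignedVectorDistribute` parameter standing in for the flag) is hypothetical, and it assumes "unaligned to intrinsic" means that some M, N, or K dimension is not a multiple of the corresponding intrinsic dimension.

```cpp
#include <cstdint>

// Hypothetical types for illustration; these are not IREE's actual API.
enum class Pipeline { VectorDistribute, SIMT };

struct BatchMatmulShape {
  int64_t batch, m, n, k;
};

struct IntrinsicShape {
  int64_t m, n, k;
};

// Assumption: "aligned" means each of M, N, K is a multiple of the
// matching intrinsic dimension.
bool isAlignedToIntrinsic(const BatchMatmulShape &shape,
                          const IntrinsicShape &intrinsic) {
  return shape.m % intrinsic.m == 0 && shape.n % intrinsic.n == 0 &&
         shape.k % intrinsic.k == 0;
}

// The parameter stands in for the off-by-default flag mentioned above;
// its real name is not given here, so this one is made up.
Pipeline selectPipeline(const BatchMatmulShape &shape,
                        const IntrinsicShape &intrinsic,
                        bool enableUnalignedVectorDistribute = false) {
  if (isAlignedToIntrinsic(shape, intrinsic))
    return Pipeline::VectorDistribute;
  // Unaligned batch matmul: take vector distribute only on explicit opt-in.
  if (enableUnalignedVectorDistribute)
    return Pipeline::VectorDistribute;
  // Otherwise bail out and fall back to SIMT.
  return Pipeline::SIMT;
}
```

With the stand-in flag left at its default, an unaligned shape falls back to SIMT; passing `true` restores the previous behavior for shapes known to work.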

…r distribute

Signed-off-by: Nirvedh <nirvedh@gmail.com>
@nirvedhmeshram force-pushed the disable_unaligned_bmm_vectordistribute branch from bde1b32 to e602dde on October 29, 2024 17:06
Successfully merging this pull request may close these issues.

[gpu]: error: 'vector.transfer_read' op Anchoring on transfer_read with unsupported number of elements