[GPU] Disable unaligned to intrinsic batch matmul codegen with vector distribute #18935
This path doesn't support all batch matmul shapes, but it attempts them anyway and fails,
e.g. #18601.
By default we should favor broader functional support over performance, so this PR keeps the path behind a flag that is off by default.
Fixes: #18601
If we bail out here, we fall back to SIMT for now (which we already do for non-batch-matmul GEMMs with such shapes); Tile and Fuse pipeline support is planned for the future. In the meantime, models whose shapes are supported by this path can enable it with the provided flag, and tuners can always use this pipeline if it works for the shape. We can also turn this on by default once we add correct heuristics for when it is safe to use this path.
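
The fallback described above can be illustrated with a small standalone sketch. This is not the code from this PR. Every name here (`Matmul`, `Pipeline`, `select_pipeline`, `allow_unaligned_batch`) is hypothetical, the `(16, 16, 16)` intrinsic shape is only an example, and the real selection logic likely considers more than simple divisibility.

```python
from dataclasses import dataclass
from enum import Enum


class Pipeline(Enum):
    VECTOR_DISTRIBUTE = "vector_distribute"
    SIMT = "simt"


@dataclass(frozen=True)
class Matmul:
    """Hypothetical problem description: M x K times K x N."""
    m: int
    n: int
    k: int
    is_batch: bool = False


def is_aligned(op, intrinsic):
    """True when M, N and K are all multiples of the intrinsic's (M, N, K)."""
    im, i_n, ik = intrinsic
    return op.m % im == 0 and op.n % i_n == 0 and op.k % ik == 0


def select_pipeline(op, intrinsic, allow_unaligned_batch=False):
    """Pick a pipeline; the flag (off by default) is an assumed opt-in."""
    if is_aligned(op, intrinsic):
        return Pipeline.VECTOR_DISTRIBUTE
    # Unaligned batch matmuls only take the vector distribute path on opt-in.
    if op.is_batch and allow_unaligned_batch:
        return Pipeline.VECTOR_DISTRIBUTE
    # Everything else that is unaligned bails out to SIMT, matching what
    # already happens for non-batch GEMMs with such shapes.
    return Pipeline.SIMT


if __name__ == "__main__":
    unaligned = Matmul(m=30, n=64, k=64, is_batch=True)
    print(select_pipeline(unaligned, (16, 16, 16)))
    print(select_pipeline(unaligned, (16, 16, 16), allow_unaligned_batch=True))
```

Keeping the opt-in as an explicit parameter mirrors the intent stated above: the default favors shapes that are known to compile, while a model or tuner that knows its shape works can still request the faster path.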