Support for CUDA 12.4 and above? URGENT PERHAPS? #1292
Because wheels compiled with CUDA 12.3 will work with 12.4, as long as the PyTorch versions are the same.
Thanks for clarifying. With that being said, will you please clarify this in the release notes? For example, when a wheel's name contains "cu118" I assume it'll only work with CUDA 11.8, not CUDA 12 and so on. And when it says "cu122" (e.g. release 2.6.0.post1) it's natural to interpret this as ONLY working with CUDA 12.2, nothing higher or lower. Basically, can the release notes clarify whether a wheel named "flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp311-cp311-linux_x86_64.whl", for example, will work with CUDA 12.4, 12.5, and so on? We're all the way up to 12.6 now, too!
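For readers puzzling over the same wheel names, here is a small sketch of how the tags in a filename like the one above can be pulled apart. This helper is hypothetical (not part of flash-attn or its setup.py); it just assumes the naming scheme seen in the releases: package, version, then `+cu<cuda>torch<torch>cxx11abi<TRUE|FALSE>` local-version tags.

```python
import re

# Hypothetical parser for flash-attn wheel filenames, assuming the
# naming scheme seen in the releases (e.g. cu123 = built against CUDA 12.3).
WHEEL_RE = re.compile(
    r"flash_attn-(?P<version>[\d.]+(?:\.post\d+)?)"
    r"\+cu(?P<cuda>\d+)"
    r"torch(?P<torch>[\d.]+)"
    r"cxx11abi(?P<abi>TRUE|FALSE)"
)

def parse_wheel_name(name: str) -> dict:
    """Extract the version/cuda/torch/abi tags from a flash-attn wheel filename."""
    m = WHEEL_RE.search(name)
    if m is None:
        raise ValueError(f"unrecognized wheel name: {name}")
    tags = m.groupdict()
    # "cu123" -> "12.3": last digit is the minor version
    tags["cuda"] = f"{tags['cuda'][:-1]}.{tags['cuda'][-1]}"
    return tags

print(parse_wheel_name(
    "flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp311-cp311-linux_x86_64.whl"
))
```

Per the maintainer's comments in this thread, the `cu123` tag names the CUDA version the wheel was *built* against, not the only CUDA version it runs on.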
setup.py downloads the right wheel automatically
I understand that, but I'm using Windows and hence am using the wheels here: https://github.com/bdashore3/flash-attention/releases/ Can you please just clarify for me what CUDA versions (even 12.6?) release 2.6.3 supports, even if you don't want to update the release notes as a favor to me?
All CUDA 12 minor versions are compatible
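The rule stated above — a wheel built against any CUDA 12.x works with any other CUDA 12.x, provided the torch versions match — can be sketched as a simple major-version check. The helper below is an illustration only (a hypothetical function, not part of flash-attn):

```python
# Minor-version compatibility within a CUDA major release: a wheel
# built against 12.3 runs on 12.4, 12.5, 12.6, etc., but an 11.x
# wheel does not run on CUDA 12. (Assumes matching torch versions.)
def cuda_compatible(wheel_cuda: str, system_cuda: str) -> bool:
    """Same CUDA major version => compatible."""
    return wheel_cuda.split(".")[0] == system_cuda.split(".")[0]

print(cuda_compatible("12.3", "12.4"))  # True: cu123 wheel on CUDA 12.4
print(cuda_compatible("11.8", "12.6"))  # False: major versions differ
```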
Do you care to update the release notes by adding a sentence, which should take you less than 5 minutes, for myself and the thousands of others who use this great library? Whether you do or don't, you might leave this issue open for others with the same question, or close it; my feelings won't be hurt. Thanks for clarifying at any rate.
I'll probably change the wheel name to |
Currently, the latest prebuilt wheels for FA2 only support up to CUDA 12.3. This is problematic since torch versions 2.3.1 through 2.5.0 only support CUDA 12.1 or CUDA 12.4, i.e. not CUDA 12.3. Further, recent models like minicpm 2.6, phi 3.5 mini, and deepseek coder lite either prefer flash attention 2 and/or will not work without it (e.g. using SDPA). Further, Triton wheels 3.1.0 and above require torch 2.4.0+...
What I'm saying is: why haven't there been any releases of FA2 with CUDA support above and beyond CUDA 12.3?