Support for CUDA 12.4 and above? URGENT PERHAPS? #1292

Open
BBC-Esq opened this issue Oct 22, 2024 · 7 comments

Comments

BBC-Esq commented Oct 22, 2024

Currently, the latest prebuilt wheels for FA2 only support up to CUDA 12.3. This is problematic since torch versions 2.3.1 through 2.5.0 only support CUDA 12.1 or CUDA 12.4, i.e. not CUDA 12.3.

Further, recent models like MiniCPM 2.6, Phi 3.5 mini, and DeepSeek Coder Lite either prefer Flash Attention 2 or will not work without it (e.g. they fail when falling back to SDPA).

On top of that, Triton wheels 3.1.0 and above require torch 2.4.0+.

In short: why haven't there been any FA2 releases built for CUDA versions above 12.3?
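(For context, here is a minimal sketch of how to check which CUDA toolkit your installed torch build targets, so it can be compared against the "cuXXX" tag in a wheel name; the printed values are just examples.)

```python
# Minimal sketch: check which CUDA toolkit the installed PyTorch build targets,
# to compare against the "cuXXX" tag in a flash-attn wheel name.
import torch

print(torch.__version__)          # e.g. "2.4.0+cu124"
print(torch.version.cuda)         # e.g. "12.4" -- the toolkit torch was built against
print(torch.cuda.is_available())  # True if a usable GPU/driver is present
```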

tridao (Contributor) commented Oct 22, 2024

Because wheels compiled with CUDA 12.3 will work with 12.4, as long as the PyTorch versions are the same.
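(As a quick sanity check after installing a cu123 wheel on a CUDA 12.4 system, something like the following should work; this is an illustrative snippet, assuming the installed torch matches the torch version in the wheel name as noted above.)

```python
# Illustrative check: a wheel built against CUDA 12.3 importing fine under CUDA 12.4,
# provided the installed torch matches the torch version baked into the wheel name.
import torch
import flash_attn

print(flash_attn.__version__)  # e.g. "2.6.3"
print(torch.version.cuda)      # e.g. "12.4", even though the wheel was built with 12.3
```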

BBC-Esq (Author) commented Oct 22, 2024

Thanks for clarifying. That said, will you please make this clear in the release notes? For example, when a wheel's name contains "cu118" I assume it will only work with CUDA 11.8, not CUDA 12, and so on. And when it says "cu122" (e.g. release 2.6.0.post1), it's natural to interpret that as working ONLY with CUDA 12.2, nothing higher or lower.

Basically, can the release notes clarify whether a wheel named "flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp311-cp311-linux_x86_64.whl", for example, will work with CUDA 12.4, 12.5, and 12.6 (we're all the way up to 12.6 now)?
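(To make the question concrete, here is an illustrative breakdown of what the filename tags appear to encode; the regex below is an assumption based on the release naming, not an official spec.)

```python
# Illustrative only: splitting a flash-attn wheel filename into its tags.
# The pattern is a guess based on the release naming convention, not anything official.
import re

name = "flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp311-cp311-linux_x86_64.whl"
m = re.match(
    r"flash_attn-(?P<pkg>[\d.]+(?:\.post\d+)?)"
    r"\+cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)cxx11abi(?P<abi>TRUE|FALSE)"
    r"-(?P<py>cp\d+)-cp\d+-(?P<platform>.+)\.whl",
    name,
)
print(m.groupdict())
# {'pkg': '2.6.3', 'cuda': '123', 'torch': '2.4', 'abi': 'FALSE',
#  'py': 'cp311', 'platform': 'linux_x86_64'}
```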

tridao (Contributor) commented Oct 22, 2024

setup.py downloads the right wheel automatically
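(Rough sketch of the kind of environment check an install script can do to pick a matching wheel tag; this is illustrative, not the actual setup.py logic.)

```python
# Rough sketch (not the actual setup.py logic): deriving the wheel tags that
# match the local environment. Per the maintainer's comment above, only the
# CUDA major version needs to match, so any 12.x toolkit pairs with a cu12 wheel.
import sys
import torch

torch_tag = ".".join(torch.__version__.split("+")[0].split(".")[:2])  # e.g. "2.4"
cuda_major = torch.version.cuda.split(".")[0]                         # e.g. "12" (None on CPU-only builds)
python_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"    # e.g. "cp311"

print(f"Need a wheel built for torch {torch_tag}, cu{cuda_major}, {python_tag}")
```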

BBC-Esq (Author) commented Oct 22, 2024

I understand that, but I'm on Windows and am therefore using the wheels here:

https://github.com/bdashore3/flash-attention/releases/

Can you please just clarify which CUDA versions (even 12.6?) release 2.6.3 supports, even if you'd rather not update the release notes?

tridao (Contributor) commented Oct 22, 2024

All CUDA 12 minor versions are compatible

BBC-Esq (Author) commented Oct 22, 2024

Would you care to update the release notes by adding a sentence, which should take less than 5 minutes, for me and the thousands of others who use this great library? Whether you do or don't, you might leave this issue open for others with the same question, or close it; my feelings won't be hurt. Thanks for clarifying, at any rate.

tridao (Contributor) commented Oct 22, 2024

I'll probably change the wheel names to cu11 and cu12 to avoid confusion. Thanks for the feedback.
