Some features (cutlassF, smallkF, ...) appear to be unavailable when executing 'python -m xformers.info' #21

Zars19 opened this issue Aug 22, 2024 · 2 comments

Zars19 commented Aug 22, 2024

❓ Questions and Help

Some features appear to be unavailable when executing 'python -m xformers.info' (cutlassF, smallkF, ...)
Is this normal?

xFormers 0.0.27+7a04357.d20240822
memory_efficient_attention.ckF:                    available
memory_efficient_attention.ckB:                    available
memory_efficient_attention.ck_decoderF:            available
memory_efficient_attention.ck_splitKF:             available
memory_efficient_attention.cutlassF:               unavailable
memory_efficient_attention.cutlassB:               unavailable
memory_efficient_attention.decoderF:               unavailable
memory_efficient_attention.flshattF@2.5.6-pt:      available
memory_efficient_attention.flshattB@2.5.6-pt:      available
memory_efficient_attention.smallkF:                unavailable
memory_efficient_attention.smallkB:                unavailable
memory_efficient_attention.triton_splitKF:         available
indexing.scaled_index_addF:                        available
indexing.scaled_index_addB:                        available
indexing.index_select:                             available
sequence_parallel_fused.write_values:              available
sequence_parallel_fused.wait_values:               available
sequence_parallel_fused.cuda_memset_32b_async:     available
sp24.sparse24_sparsify_both_ways:                  available
sp24.sparse24_apply:                               available
sp24.sparse24_apply_dense_output:                  available
sp24._sparse24_gemm:                               available
sp24._cslt_sparse_mm@0.0.0:                        available
swiglu.dual_gemm_silu:                             available
swiglu.gemm_fused_operand_sum:                     available
swiglu.fused.p.cpp:                                available
is_triton_available:                               True
pytorch.version:                                   2.4.0+rocm6.1
pytorch.cuda:                                      available
gpu.compute_capability:                            9.4
gpu.name:                                          AMD Radeon Graphics
dcgm_profiler:                                     unavailable
build.info:                                        available
build.cuda_version:                                None
build.hip_version:                                 6.2.41133-dd7f95766
build.python_version:                              3.9.19
build.torch_version:                               2.4.0+rocm6.1
build.env.TORCH_CUDA_ARCH_LIST:                    None
build.env.PYTORCH_ROCM_ARCH:                       gfx900;gfx906;gfx908;gfx90a;gfx1030;gfx1100;gfx1101;gfx940;gfx941;gfx942
build.env.XFORMERS_BUILD_TYPE:                     None
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS:        None
build.env.NVCC_FLAGS:                              None
build.env.XFORMERS_PACKAGE_FROM:                   None
source.privacy:                                    open source
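
For reference, a minimal check of whether the dispatcher still finds a usable op despite the unavailable entries (on ROCm builds the AMD GPU is exposed as the "cuda" device):

```python
# Minimal sketch: if any forward op (e.g. ckF or the flash backend) is
# usable, memory_efficient_attention should dispatch without raising.
import torch
import xformers.ops as xops

# [batch, seq_len, heads, head_dim]; fp16 inputs on the GPU
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = xops.memory_efficient_attention(q, k, v)
print(out.shape)  # expected: torch.Size([1, 128, 8, 64])
```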
Zars19 (Author) commented Aug 22, 2024


And the result of test_mem_eff_attention.py is:
2489 failed, 3483 passed, 9033 skipped, 36 warnings in 2539.42s (0:42:19)
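
One way to see whether the failures cluster around a single backend is to rerun a filtered subset from the repository root. A sketch, where the `-k` substring `flash` is a guess at the relevant test names rather than a verified filter:

```python
# Sketch: rerun only tests whose names mention "flash" to check whether
# the failures are concentrated in the flash-attention paths.
# The "flash" substring is an assumption about the test naming.
import pytest

pytest.main(["tests/test_mem_eff_attention.py", "-k", "flash", "-q"])
```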

tenpercent (Collaborator) commented

Hi @Zars19

CUTLASS-related extensions are only compiled for CUDA (not ROCm) builds.

The currently failing tests are related to the pytorch-internal Flash Attention implementation; that op should be disabled on ROCm because it does not support the tested features.
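
In the meantime, callers can pin the dispatcher to the CK ops so the unsupported path is never selected. A sketch, assuming the ROCm build exposes the composable-kernel ops under `xformers.ops.fmha.ck` (inferred from the `ckF`/`ckB` entries above, not verified):

```python
# Sketch: force memory_efficient_attention onto the CK forward/backward
# ops, bypassing the pytorch-internal flash implementation entirely.
# The fmha.ck module path is an assumption based on the xformers.info output.
import torch
import xformers.ops.fmha as fmha

q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = fmha.memory_efficient_attention(q, k, v, op=(fmha.ck.FwOp, fmha.ck.BwOp))
```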
