Use CUPTI_API_VERSION instead of CUDA_VERSION (#792)
Summary: The CUPTI version and the CUDA version can mismatch; in that case we should check CUPTI_API_VERSION, because CUPTI is where this enum is defined.

Verified on an H100 machine with this script:

```python
import torch

def fn(x, y, z):
    return torch.addmm(z, x, y)

x, y, z = [torch.rand((16, 16), device='cuda') for _ in range(3)]

with torch.profiler.profile() as prof:
    for i in range(4):
        fn(x, y, z)

prof.export_chrome_trace("profile_addmm.json")
```

I verified (on H100):

* Checking out the commit before the cudaLaunchKernelExC changes in kineto, I can find "INVALID" in profile_addmm.json.
* On the current main branch, I cannot find "INVALID" in profile_addmm.json.
* On this branch, I cannot find "INVALID" in profile_addmm.json.

This confirms that (a) the test reproduces the behavior, and (b) this PR doesn't break support for cudaLaunchKernelExC.

Pull Request resolved: #792

Reviewed By: aaronenyeshi

Differential Revision: D47823117

Pulled By: davidberard98

fbshipit-source-id: 102d23a23345327c229a7d4664a1781d7c259855