[cuda] Optimize device signal to host wait synchronization #14876

Merged: 2 commits, Aug 31, 2023

antiagainst
Contributor

When we have a device `CUevent` signaling past a desired timepoint that the host is waiting on, the host can wait on that `CUevent` directly using `cuEventSynchronize`. Profiling shows this to be noticeably faster than waiting on the full flow of device signal -> `cuLaunchHostFunc` (async) -> host signal -> host polling.
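
The fast-path decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in `experimental/cuda2/event_semaphore.c`: the types `device_event_t` and `device_timepoint_t`, the function `wait_for_value`, and the stub callbacks are all hypothetical names introduced here. The blocking calls are passed in as function pointers so the sketch builds without the CUDA toolkit; in a real build the event callback would presumably wrap `cuEventSynchronize`, and the host callback would be the existing host-side wait.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for a CUevent handle so this sketch compiles without
// <cuda.h>; real code would use CUevent directly.
typedef void* device_event_t;

// Hypothetical record of a pending device signal: once `event` completes, the
// semaphore payload is known to have reached `signal_value`.
typedef struct {
  uint64_t signal_value;
  device_event_t event;
} device_timepoint_t;

// Blocking waits supplied by the caller (assumptions for illustration):
// - the event wait would wrap cuEventSynchronize(event);
// - the host wait is the slow path that relies on the cuLaunchHostFunc
//   callback signaling the host, followed by host polling.
typedef int (*event_wait_fn_t)(device_event_t event);
typedef int (*host_wait_fn_t)(uint64_t value);

typedef enum { WAIT_PATH_DEVICE_EVENT, WAIT_PATH_HOST_POLL } wait_path_t;

// Waits until the semaphore reaches `wanted`. If any pending device timepoint
// signals a value at or past `wanted`, block on its event directly; otherwise
// fall back to the host wait. Returns which path was taken and stores the
// callback's status code in `out_status`.
static wait_path_t wait_for_value(const device_timepoint_t* timepoints,
                                  size_t count, uint64_t wanted,
                                  event_wait_fn_t wait_event,
                                  host_wait_fn_t wait_host, int* out_status) {
  for (size_t i = 0; i < count; ++i) {
    if (timepoints[i].signal_value >= wanted) {
      *out_status = wait_event(timepoints[i].event);
      return WAIT_PATH_DEVICE_EVENT;
    }
  }
  *out_status = wait_host(wanted);
  return WAIT_PATH_HOST_POLL;
}

// Stub callbacks that only count invocations, so the selection logic can be
// exercised on a machine without a GPU.
static int g_event_waits = 0;
static int g_host_waits = 0;

static int stub_event_wait(device_event_t event) {
  (void)event;
  ++g_event_waits;
  return 0;
}

static int stub_host_wait(uint64_t value) {
  (void)value;
  ++g_host_waits;
  return 0;
}
```

With pending device timepoints at values 1, 3, and 5, a host wait for value 2 would take the event path through the timepoint at 3, while a wait for value 9 would fall back to the host path because no pending device signal reaches it.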

@antiagainst antiagainst added the hal/cuda Runtime CUDA HAL backend label Aug 30, 2023
Review comments on experimental/cuda2/event_semaphore.c (4 threads, outdated and resolved)
@antiagainst antiagainst merged commit babd4d9 into iree-org:main Aug 31, 2023
46 checks passed
@antiagainst antiagainst deleted the cuda2-d2h-opt-pr branch August 31, 2023 17:18
jinchen62 pushed a commit to jinchen62/iree that referenced this pull request Sep 18, 2023