As far as I know, we do not document any synchronisation behaviours for alpaka buffers.
However, different buffers and different back-ends effectively implement different behaviours.
A buffer allocated on a CUDA GPU with alpaka::allocBuf() internally uses cudaMalloc() to allocate the memory and cudaFree() to release it. While not mentioned explicitly in the CUDA documentation, the observed behaviour is that cudaMalloc() and cudaFree()¹ are blocking and synchronise with all kernels executing on the current GPU.
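For reference, a minimal sketch of this first case, assuming a recent (1.x-style) alpaka API; the exact platform/device helper names may differ between alpaka versions:

```cpp
#include <alpaka/alpaka.hpp>

#include <cstddef>

int main()
{
    using Dim = alpaka::DimInt<1u>;
    using Idx = std::size_t;
    using Acc = alpaka::AccGpuCudaRt<Dim, Idx>;

    auto const platform = alpaka::Platform<Acc>{};
    auto const device = alpaka::getDevByIdx(platform, 0);
    auto const extent = alpaka::Vec<Dim, Idx>{1024u};

    // Internally calls cudaMalloc(); the matching cudaFree() runs when the
    // buffer goes out of scope.
    auto buffer = alpaka::allocBuf<float, Idx>(device, extent);
}   // <- implicit cudaFree() here, observed to synchronise the whole device
```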
A buffer allocated on the host with alpaka::allocMappedBuf() for the CUDA platform internally uses cudaMallocHost() and cudaFreeHost(). I have no idea if these are blocking calls that synchronise with the current (or all) CUDA device(s).
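A sketch of the mapped-buffer case; the allocMappedBuf() signature has changed across alpaka releases, so the extra platform argument below is an assumption:

```cpp
#include <alpaka/alpaka.hpp>

#include <cstddef>

// Hypothetical helper, just to keep the sketch self-contained.
void mappedBufferExample()
{
    using Dim = alpaka::DimInt<1u>;
    using Idx = std::size_t;
    using Acc = alpaka::AccGpuCudaRt<Dim, Idx>;

    auto const platformHost = alpaka::PlatformCpu{};
    auto const host = alpaka::getDevByIdx(platformHost, 0);
    auto const platformAcc = alpaka::Platform<Acc>{};
    auto const extent = alpaka::Vec<Dim, Idx>{1024u};

    // Internally calls cudaMallocHost(); cudaFreeHost() runs at end of scope.
    // Whether either call synchronises with the device(s) is the open question above.
    auto mapped = alpaka::allocMappedBuf<float, Idx>(host, platformAcc, extent);
}
```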
A buffer allocated on the host with alpaka::allocBuf() internally uses aligned new and delete. These do not imply any synchronisations.
A buffer allocated on the host with alpaka::allocMappedBuf() for the CPU platform uses allocBuf(), and does not imply any synchronisations.
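The two host-only cases above should involve no device synchronisation at all; same caveats about exact API names:

```cpp
#include <alpaka/alpaka.hpp>

#include <cstddef>

// Hypothetical helper, just to keep the sketch self-contained.
void hostBufferExample()
{
    using Dim = alpaka::DimInt<1u>;
    using Idx = std::size_t;

    auto const platformHost = alpaka::PlatformCpu{};
    auto const host = alpaka::getDevByIdx(platformHost, 0);
    auto const extent = alpaka::Vec<Dim, Idx>{1024u};

    // Plain host memory via aligned new/delete: no device involved, no synchronisation.
    auto hostBuf = alpaka::allocBuf<float, Idx>(host, extent);

    // For the CPU platform allocMappedBuf() falls back to allocBuf(): still no synchronisation.
    auto hostMapped = alpaka::allocMappedBuf<float, Idx>(host, platformHost, extent);
}
```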
A buffer allocated with alpaka::allocAsyncBuf() should be queue-ordered and asynchronous with respect to the host, but we should double-check that this is indeed the current behaviour :-)
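A sketch of the expected queue-ordered behaviour (again assuming the 1.x-style API; the comments describe the intended behaviour, which is exactly what we should verify):

```cpp
#include <alpaka/alpaka.hpp>

#include <cstddef>

// Hypothetical helper, just to keep the sketch self-contained.
void asyncBufferExample()
{
    using Dim = alpaka::DimInt<1u>;
    using Idx = std::size_t;
    using Acc = alpaka::AccGpuCudaRt<Dim, Idx>;
    using Queue = alpaka::Queue<Acc, alpaka::NonBlocking>;

    auto const platform = alpaka::Platform<Acc>{};
    auto const device = alpaka::getDevByIdx(platform, 0);
    auto queue = Queue{device};
    auto const extent = alpaka::Vec<Dim, Idx>{1024u};

    {
        // Expected: the allocation is enqueued on `queue` and does not block the host.
        auto asyncBuf = alpaka::allocAsyncBuf<float, Idx>(queue, extent);
        // ... enqueue kernels that use asyncBuf on the same queue ...
    }   // <- the release should also be queue-ordered, not host-synchronous

    alpaka::wait(queue);
}
```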
--
¹ except for memory allocated by cudaMallocAsync().
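For comparison, the stream-ordered CUDA runtime calls the footnote refers to look like this (plain CUDA, no alpaka; available since CUDA 11.2):

```cpp
#include <cuda_runtime.h>

#include <cstddef>

// Hypothetical helper, just to keep the sketch self-contained.
void streamOrderedExample(cudaStream_t stream, std::size_t bytes)
{
    void* ptr = nullptr;
    cudaMallocAsync(&ptr, bytes, stream);   // enqueued on `stream`, no device-wide sync
    // ... launch kernels on `stream` that use `ptr` ...
    cudaFreeAsync(ptr, stream);             // also enqueued on `stream`
}
```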