Cuda interop vk13 #637

Open
wants to merge 64 commits into
base: master
Changes from 1 commit
d293b9b
create exportable buffers to import into cuda
atkurtul Jul 8, 2023
f5f1017
add missing cuda fn and update submodule
atkurtul Jul 9, 2023
6689b33
add missing cuda export functions
atkurtul Jul 9, 2023
9ade1c6
move boilerplates to CCUDADevice
atkurtul Jul 9, 2023
bfa7afc
correct chained cleanup destruction order
atkurtul Jul 15, 2023
ddb861e
add safety checks
atkurtul Jul 15, 2023
f380398
semaphore interop
atkurtul Jul 15, 2023
2f7b517
get cuda interop working in vulkan_1_3 branch
atkurtul Jul 15, 2023
bd32f36
point jitify to the right hash
atkurtul Jan 4, 2024
b1c5a46
update examples && use non KHR version of vk functions
atkurtul Jan 4, 2024
0d36581
correct bad validations, KHR instead of core func usage etc.
devshgraphicsprogramming Jan 4, 2024
725a984
revert a dangerous api change
devshgraphicsprogramming Jan 4, 2024
d2c9382
update examples_tests
devshgraphicsprogramming Jan 4, 2024
2d24604
Disabled CSPIRVIntrospector
Przemog1 Jan 5, 2024
2114e50
small fixes
Przemog1 Jan 5, 2024
f6320ce
remove unused cruft
devshgraphicsprogramming Jan 6, 2024
f749ab8
draft
devshgraphicsprogramming Jan 7, 2024
ad1e6ff
move the TimelineEventHandlers to their own header, simplifying every…
devshgraphicsprogramming Jan 8, 2024
a1afcc8
Made the TimelineEventHandlerST use a const ISemaphore, almost all of…
devshgraphicsprogramming Jan 8, 2024
262281f
implement MultiTimelineEventHandlerST and correct TimelineEventHandlerST
devshgraphicsprogramming Jan 8, 2024
d7690be
fix KHR function loading bugs
devshgraphicsprogramming Jan 8, 2024
13ff02a
fix some nasty bug in TimelineEventHandlerST
devshgraphicsprogramming Jan 8, 2024
fabc999
Take the TimelineEventHandlerST for a first spin with ICommandPoolCache
devshgraphicsprogramming Jan 8, 2024
0eb8e9a
turns out its quite easy to port the other utilities to the new Multi…
devshgraphicsprogramming Jan 8, 2024
e59408d
remove more unused stuff
devshgraphicsprogramming Jan 8, 2024
3f41a81
fix one liner huge bug
devshgraphicsprogramming Jan 8, 2024
fb1f50d
fix a small bug and introduce a base class for TimelineEventHandler, a…
devshgraphicsprogramming Jan 9, 2024
94ee680
fix one more KHR function pointer bug and remove unused class
devshgraphicsprogramming Jan 9, 2024
c761d42
bring back bits of IUtilities needed for ex 05
devshgraphicsprogramming Jan 9, 2024
04689b9
device cap traits
atkurtul Dec 5, 2023
4a17eaf
port macros to boost pp
atkurtul Dec 5, 2023
5fcad02
has_member_x_with_type
atkurtul Dec 5, 2023
3c97ef1
make e_member_presence bitflags
atkurtul Dec 5, 2023
06b43af
Use new inline SPIR-V builtin syntax from DXC
devshgraphicsprogramming Jan 10, 2024
fd73e28
const correctness on surface capabilities
devshgraphicsprogramming Jan 12, 2024
153dd21
3D Blit test case was failing because of unimplemented functions for …
devshgraphicsprogramming Jan 12, 2024
bc7e24d
Make the SPhysicalDeviceFilter use spans for requirement arrays.
devshgraphicsprogramming Jan 12, 2024
b234d3b
ok so I found out that renderdoc hates External memory
devshgraphicsprogramming Jan 12, 2024
b5a633a
fix typos causing issues
devshgraphicsprogramming Jan 12, 2024
2ab33ed
API draft
devshgraphicsprogramming Jan 12, 2024
bbc5aa9
think about the other 3 utility functions
devshgraphicsprogramming Jan 12, 2024
d41f279
design clearing up
devshgraphicsprogramming Jan 12, 2024
04d05da
Ok we're done here with the Streaming Buffer upload port (removed the…
devshgraphicsprogramming Jan 12, 2024
3d034c5
move the SIntendedSubmitInfo struct out of IUtilities
devshgraphicsprogramming Jan 12, 2024
3160a46
going to sleep, next TODO is to implement the IUtilities::downloadBuf…
devshgraphicsprogramming Jan 12, 2024
8670d42
outline the TODO for @theoreticalphysicsftw
devshgraphicsprogramming Jan 13, 2024
2d86373
fix debugmessenger not being created
atkurtul Jan 13, 2024
ca2593c
fix a validation error
devshgraphicsprogramming Jan 13, 2024
461cb4a
rework pipeline barriers and events to use std::spans
devshgraphicsprogramming Jan 13, 2024
d96fd1d
Port `downloadBufferRangeViaStagingBuffer`
devshgraphicsprogramming Jan 13, 2024
2d2acc9
fix bug in CRAIISpanPatch
devshgraphicsprogramming Jan 13, 2024
60c1c39
Ported Example 23, and fixed a few bugs here and there
devshgraphicsprogramming Jan 14, 2024
3faf1fb
merge conflicts
atkurtul Jan 13, 2024
fd4f733
add missing external resource property queries
atkurtul Jan 14, 2024
5b1940c
add more stuff
atkurtul Jan 14, 2024
7074256
Merge branch 'vulkan_1_3' into cuda-interop-vk13
atkurtul Jan 14, 2024
6449b2f
Merge branch 'vulkan_1_3' into cuda-interop-vk13
atkurtul Jan 18, 2024
3d9a530
address pr comments
atkurtul Jan 18, 2024
4d174e5
last commit part 2
atkurtul Jan 18, 2024
cbd18f4
add missing cuda fn & map queue indices to vk
atkurtul Jan 18, 2024
23fe8d4
update submodule
atkurtul Jan 18, 2024
c32fd79
cache cuda devices
atkurtul Jan 18, 2024
4e2185c
ifdef platform code
atkurtul Jan 19, 2024
bd0b76a
log queue validation warning
atkurtul Jan 19, 2024
2 changes: 1 addition & 1 deletion include/nbl/video/CCUDASharedMemory.h
@@ -49,7 +49,7 @@ class CCUDASharedMemory : public core::IReferenceCounted

 core::smart_refctd_ptr<IDeviceMemoryAllocation> exportAsMemory(ILogicalDevice* device, IDeviceMemoryBacked* dedication = nullptr) const;

-core::smart_refctd_ptr<IGPUImage> exportAsImage(ILogicalDevice* device, asset::IImage::SCreationParams&& params) const;
+core::smart_refctd_ptr<IGPUImage> createAndBindImage(ILogicalDevice* device, IGPUImage::SCreationParams&& params) const;

 protected:

6 changes: 3 additions & 3 deletions include/nbl/video/CVulkanDeviceMemoryBacked.h
@@ -35,11 +35,11 @@ class CVulkanDeviceMemoryBacked : public Interface
 protected:
 // special constructor for when memory requirements are known up-front (so far only swapchains and internal forwarding here)
 CVulkanDeviceMemoryBacked(const CVulkanLogicalDevice* dev, Interface::SCreationParams&& _creationParams, const IDeviceMemoryBacked::SDeviceMemoryRequirements& _memReqs, const VkResource_t vkHandle);
-CVulkanDeviceMemoryBacked(const CVulkanLogicalDevice* dev, Interface::SCreationParams&& _creationParams, const VkResource_t vkHandle) :
-CVulkanDeviceMemoryBacked(dev,std::move(_creationParams),obtainRequirements(dev,vkHandle),vkHandle) {}
+CVulkanDeviceMemoryBacked(const CVulkanLogicalDevice* dev, Interface::SCreationParams&& _creationParams, bool dedicatedOnly, const VkResource_t vkHandle) :
+CVulkanDeviceMemoryBacked(dev,std::move(_creationParams), obtainRequirements(dev, dedicatedOnly, vkHandle),vkHandle) {}

 private:
-static IDeviceMemoryBacked::SDeviceMemoryRequirements obtainRequirements(const CVulkanLogicalDevice* device, const VkResource_t vkHandle);
+static IDeviceMemoryBacked::SDeviceMemoryRequirements obtainRequirements(const CVulkanLogicalDevice* device, bool dedicatedOnly, const VkResource_t vkHandle);

 core::smart_refctd_ptr<IDeviceMemoryAllocation> m_memory = nullptr;
 size_t m_offset = 0u;

6 changes: 4 additions & 2 deletions include/nbl/video/IPhysicalDevice.h
@@ -793,11 +793,12 @@ class NBL_API2 IPhysicalDevice : public core::Interface, public core::Unmovable
 SExternalImageFormatProperties getExternalImageProperties(
 asset::E_FORMAT format,
 IGPUImage::TILING tiling,
+IGPUImage::E_TYPE type,
 core::bitflag<IGPUImage::E_USAGE_FLAGS> usage,
 core::bitflag<IGPUImage::E_CREATE_FLAGS> flags,
 IDeviceMemoryAllocation::E_EXTERNAL_HANDLE_TYPE handleType) const
 {
-auto key = std::tuple{ format, tiling, usage, flags, handleType };
+auto key = std::tuple{ format, tiling, type, usage, flags, handleType };
 {
 std::shared_lock lock(m_externalImagePropertiesMutex);
 auto it = m_externalImageProperties.find(key);
@@ -806,7 +807,7 @@ class NBL_API2 IPhysicalDevice : public core::Interface, public core::Unmovable
 }

 std::unique_lock lock(m_externalImagePropertiesMutex);
-return m_externalImageProperties[key] = getExternalImageProperties_impl(format, tiling, usage, flags, handleType);
+return m_externalImageProperties[key] = getExternalImageProperties_impl(format, tiling, type, usage, flags, handleType);
 }

 protected:
@@ -878,6 +879,7 @@ class NBL_API2 IPhysicalDevice : public core::Interface, public core::Unmovable
 virtual SExternalImageFormatProperties getExternalImageProperties_impl(
 asset::E_FORMAT format,
 IGPUImage::TILING tiling,
+IGPUImage::E_TYPE type,
 core::bitflag<IGPUImage::E_USAGE_FLAGS> usage,
 core::bitflag<IGPUImage::E_CREATE_FLAGS> flags,
 IDeviceMemoryAllocation::E_EXTERNAL_HANDLE_TYPE handleType) const = 0;

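The lookup in `getExternalImageProperties` follows a read-mostly caching pattern: a shared lock on the hit path and an exclusive lock on the miss path. Below is a minimal sketch of that pattern in isolation. The names `Key`, `Props`, and `PropertyCache` are hypothetical stand-ins, not Nabla types, and the sketch re-checks the map under the exclusive lock, which the diff above does not do.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <tuple>

// Hypothetical key: format, tiling, image type, usage (plain ints for brevity).
using Key = std::tuple<int, int, int, unsigned>;

// Hypothetical cached value.
struct Props
{
    unsigned maxArrayLayers = 0u;
};

class PropertyCache
{
public:
    // Returns the cached value for `key`; `query` is only invoked on a miss.
    template<typename Query>
    Props get(const Key& key, Query&& query)
    {
        {
            // Hit path: any number of readers may hold the shared lock at once.
            std::shared_lock lock(m_mutex);
            auto it = m_cache.find(key);
            if (it != m_cache.end())
                return it->second;
        }
        // Miss path: exclusive lock, then look again because another thread
        // may have inserted the same key between the two locks.
        std::unique_lock lock(m_mutex);
        auto it = m_cache.find(key);
        if (it == m_cache.end())
            it = m_cache.emplace(key, query(key)).first;
        return it->second;
    }

    std::size_t size() const
    {
        std::shared_lock lock(m_mutex);
        return m_cache.size();
    }

private:
    mutable std::shared_mutex m_mutex;
    std::map<Key, Props> m_cache;
};
```

Because the image type is part of the key, two queries that differ only in type get separate entries, which is the aliasing this hunk appears to fix.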
8 changes: 2 additions & 6 deletions src/nbl/video/CCUDASharedMemory.cpp
@@ -77,16 +77,12 @@ core::smart_refctd_ptr<IGPUBuffer> CCUDASharedMemory::exportAsBuffer(ILogicalDev

 #endif

-core::smart_refctd_ptr<IGPUImage> CCUDASharedMemory::exportAsImage(ILogicalDevice* device, asset::IImage::SCreationParams&& params) const
+core::smart_refctd_ptr<IGPUImage> CCUDASharedMemory::createAndBindImage(ILogicalDevice* device, IGPUImage::SCreationParams&& params) const
 {
 if (!device || !m_device->isMatchingDevice(device->getPhysicalDevice()))
 return nullptr;

-auto img = device->createImage({
-std::move(params), {{ .externalHandleTypes = CCUDADevice::EXTERNAL_MEMORY_HANDLE_TYPE }},
-IGPUImage::TILING::LINEAR,
-1 /*preinitialized*/,
-});
+auto img = device->createImage(std::move(params));
atkurtul marked this conversation as resolved.

 if (exportAsMemory(device, img.get()))
 return img;

2 changes: 1 addition & 1 deletion src/nbl/video/CVulkanBuffer.h
@@ -16,7 +16,7 @@ class CVulkanBuffer : public CVulkanDeviceMemoryBacked<IGPUBuffer>
 using base_t = CVulkanDeviceMemoryBacked<IGPUBuffer>;

 public:
-inline CVulkanBuffer(const CVulkanLogicalDevice* dev, IGPUBuffer::SCreationParams&& creationParams, const VkBuffer buffer) : base_t(dev,std::move(creationParams),buffer) {}
+inline CVulkanBuffer(const CVulkanLogicalDevice* dev, IGPUBuffer::SCreationParams&& creationParams, bool dedicatedOnly, const VkBuffer buffer) : base_t(dev,std::move(creationParams), dedicatedOnly, buffer) {}

 void setObjectDebugName(const char* label) const override;

6 changes: 3 additions & 3 deletions src/nbl/video/CVulkanDeviceMemoryBacked.cpp
@@ -6,7 +6,7 @@ namespace nbl::video
 {

 template<class Interface>
-IDeviceMemoryBacked::SDeviceMemoryRequirements CVulkanDeviceMemoryBacked<Interface>::obtainRequirements(const CVulkanLogicalDevice* device, const VkResource_t vkHandle)
+IDeviceMemoryBacked::SDeviceMemoryRequirements CVulkanDeviceMemoryBacked<Interface>::obtainRequirements(const CVulkanLogicalDevice* device, bool dedicatedOnly, const VkResource_t vkHandle)
 {
 const std::conditional_t<IsImage,VkImageMemoryRequirementsInfo2,VkBufferMemoryRequirementsInfo2> vk_memoryRequirementsInfo = {
 IsImage ? VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2:VK_STRUCTURE_TYPE_BUFFER_MEMORY_REQUIREMENTS_INFO_2,nullptr,vkHandle
@@ -24,8 +24,8 @@ IDeviceMemoryBacked::SDeviceMemoryRequirements CVulkanDeviceMemoryBacked<Interfa
 memoryReqs.size = vk_memoryRequirements.memoryRequirements.size;
 memoryReqs.memoryTypeBits = vk_memoryRequirements.memoryRequirements.memoryTypeBits;
 memoryReqs.alignmentLog2 = std::log2(vk_memoryRequirements.memoryRequirements.alignment);
-memoryReqs.prefersDedicatedAllocation = vk_dedicatedMemoryRequirements.prefersDedicatedAllocation;
-memoryReqs.requiresDedicatedAllocation = vk_dedicatedMemoryRequirements.requiresDedicatedAllocation;
+memoryReqs.prefersDedicatedAllocation = dedicatedOnly | vk_dedicatedMemoryRequirements.prefersDedicatedAllocation;
+memoryReqs.requiresDedicatedAllocation = dedicatedOnly | vk_dedicatedMemoryRequirements.requiresDedicatedAllocation;
 return memoryReqs;
 }

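`obtainRequirements` stores the alignment as a base-2 logarithm, computed with the floating-point `std::log2`. Vulkan reports alignments as powers of two, so the same value can be derived with integer arithmetic alone. The sketch below is one possible alternative, not part of this PR; the helper name `alignmentLog2` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns log2(alignment) for a non-zero power-of-two alignment.
// Hypothetical helper, shown only to illustrate the integer approach.
inline uint32_t alignmentLog2(uint64_t alignment)
{
    // A power of two has exactly one bit set.
    assert(alignment != 0u && (alignment & (alignment - 1u)) == 0u);
    uint32_t result = 0u;
    while (alignment > 1u)
    {
        alignment >>= 1u;
        ++result;
    }
    return result;
}
```

For example, an alignment of 256 bytes yields 8, with no floating-point conversion involved.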
60 changes: 48 additions & 12 deletions src/nbl/video/CVulkanLogicalDevice.cpp
@@ -378,7 +378,7 @@ core::smart_refctd_ptr<IGPUBuffer> CVulkanLogicalDevice::createBuffer_impl(IGPUB
 VkBuffer vk_buffer;
 if (m_devf.vk.vkCreateBuffer(m_vkdev,&vk_createInfo,nullptr,&vk_buffer)!=VK_SUCCESS)
 return nullptr;
-return core::make_smart_refctd_ptr<CVulkanBuffer>(this,std::move(creationParams),vk_buffer);
+return core::make_smart_refctd_ptr<CVulkanBuffer>(this,std::move(creationParams), dedicatedOnly, vk_buffer);
 }

 core::smart_refctd_ptr<IGPUBufferView> CVulkanLogicalDevice::createBufferView_impl(const asset::SBufferRange<const IGPUBuffer>& underlying, const asset::E_FORMAT _fmt)
@@ -399,17 +399,24 @@ core::smart_refctd_ptr<IGPUBufferView> CVulkanLogicalDevice::createBufferView_im

 core::smart_refctd_ptr<IGPUImage> CVulkanLogicalDevice::createImage_impl(IGPUImage::SCreationParams&& params)
 {
-VkImageStencilUsageCreateInfo vk_stencilUsage = { VK_STRUCTURE_TYPE_IMAGE_STENCIL_USAGE_CREATE_INFO, nullptr };
-vk_stencilUsage.stencilUsage = getVkImageUsageFlagsFromImageUsageFlags(params.actualStencilUsage().value,true);
+VkExternalMemoryImageCreateInfo externalMemoryInfo = {
+.sType = VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO,
+.handleTypes = params.externalHandleTypes.value,
+};
+
+const bool external = params.externalHandleTypes.value;
+
+VkImageStencilUsageCreateInfo vk_stencilUsage = { VK_STRUCTURE_TYPE_IMAGE_STENCIL_USAGE_CREATE_INFO, &externalMemoryInfo };
+vk_stencilUsage.stencilUsage = getVkImageUsageFlagsFromImageUsageFlags(params.actualStencilUsage().value, true);

-std::array<VkFormat,asset::E_FORMAT::EF_COUNT> vk_formatList;
+std::array<VkFormat, asset::E_FORMAT::EF_COUNT> vk_formatList;
 VkImageFormatListCreateInfo vk_formatListStruct = { VK_STRUCTURE_TYPE_IMAGE_FORMAT_LIST_CREATE_INFO, &vk_stencilUsage };
 vk_formatListStruct.viewFormatCount = 0u;
 // if only there existed a nice iterator that would let me iterate over set bits 64 faster
 if (params.viewFormats.any())
-for (auto fmt=0; fmt<vk_formatList.size(); fmt++)
-if (params.viewFormats.test(fmt))
-vk_formatList[vk_formatListStruct.viewFormatCount++] = getVkFormatFromFormat(static_cast<asset::E_FORMAT>(fmt));
+for (auto fmt = 0; fmt < vk_formatList.size(); fmt++)
+if (params.viewFormats.test(fmt))
+vk_formatList[vk_formatListStruct.viewFormatCount++] = getVkFormatFromFormat(static_cast<asset::E_FORMAT>(fmt));
 vk_formatListStruct.pViewFormats = vk_formatList.data();

 VkImageCreateInfo vk_createInfo = { VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, &vk_formatListStruct };
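The hunk above links four structs through `pNext`: the image create info points at the format list, which points at the stencil usage info, which now points at the new external memory info. A minimal sketch of how such a chain is walked follows. `ChainNode` and the `TYPE_*` tags are hypothetical stand-ins that only mirror the `sType`/`pNext` header of Vulkan's extensible structs; no Vulkan headers are used.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the common header of Vulkan's extensible structs:
// a type tag followed by a pointer to the next struct in the chain.
struct ChainNode
{
    uint32_t sType;
    const ChainNode* pNext;
};

// Illustrative tags, not real VkStructureType values.
constexpr uint32_t TYPE_IMAGE_CREATE = 1u;
constexpr uint32_t TYPE_FORMAT_LIST = 2u;
constexpr uint32_t TYPE_STENCIL_USAGE = 3u;
constexpr uint32_t TYPE_EXTERNAL_MEMORY = 4u;

// Returns the first node tagged `sType` reachable from `head`, or nullptr.
inline const ChainNode* findInChain(const ChainNode* head, uint32_t sType)
{
    for (const ChainNode* node = head; node != nullptr; node = node->pNext)
        if (node->sType == sType)
            return node;
    return nullptr;
}

// Counts the nodes reachable from `head`, including `head` itself.
inline uint32_t chainLength(const ChainNode* head)
{
    uint32_t count = 0u;
    for (; head != nullptr; head = head->pNext)
        ++count;
    return count;
}
```

Building the chain in the same order as the hunk (external memory last, create info first) gives a length of four, and a search starting from the create info reaches the external memory node.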
@@ -421,16 +428,45 @@ core::smart_refctd_ptr<IGPUImage> CVulkanLogicalDevice::createImage_impl(IGPUIma
 vk_createInfo.arrayLayers = params.arrayLayers;
 vk_createInfo.samples = static_cast<VkSampleCountFlagBits>(params.samples);
 vk_createInfo.tiling = static_cast<VkImageTiling>(params.tiling);
-vk_createInfo.usage = getVkImageUsageFlagsFromImageUsageFlags(params.usage.value,asset::isDepthOrStencilFormat(params.format));
-vk_createInfo.sharingMode = params.isConcurrentSharing() ? VK_SHARING_MODE_CONCURRENT:VK_SHARING_MODE_EXCLUSIVE;
+vk_createInfo.usage = getVkImageUsageFlagsFromImageUsageFlags(params.usage.value, asset::isDepthOrStencilFormat(params.format));
+vk_createInfo.sharingMode = params.isConcurrentSharing() ? VK_SHARING_MODE_CONCURRENT : VK_SHARING_MODE_EXCLUSIVE;
 vk_createInfo.queueFamilyIndexCount = params.queueFamilyIndexCount;
 vk_createInfo.pQueueFamilyIndices = params.queueFamilyIndices;
-vk_createInfo.initialLayout = params.preinitialized ? VK_IMAGE_LAYOUT_PREINITIALIZED:VK_IMAGE_LAYOUT_UNDEFINED;
+vk_createInfo.initialLayout = params.preinitialized ? VK_IMAGE_LAYOUT_PREINITIALIZED : VK_IMAGE_LAYOUT_UNDEFINED;

+bool dedicatedOnly = false;
+if (external)
+{
+core::bitflag<IDeviceMemoryAllocation::E_EXTERNAL_HANDLE_TYPE> requestedTypes = params.externalHandleTypes;
+auto pd = dynamic_cast<const CVulkanPhysicalDevice*>(m_physicalDevice)->getInternalObject();
+while (const auto idx = hlsl::findLSB(static_cast<uint32_t>(requestedTypes.value)) + 1)
+{
+const auto handleType = static_cast<IDeviceMemoryAllocation::E_EXTERNAL_HANDLE_TYPE>(1u << (idx - 1));
+requestedTypes ^= handleType;
+
+auto props = m_physicalDevice->getExternalImageProperties(params.format, params.tiling, params.type, params.usage, params.flags, handleType);
+
+if (props.maxArrayLayers < vk_createInfo.arrayLayers ||
+!core::bitflag<IGPUImage::E_SAMPLE_COUNT_FLAGS>(props.sampleCounts).hasFlags(params.samples) ||
+/* props.maxResourceSize?? */
+props.maxExtent.width < vk_createInfo.extent.width ||
+props.maxExtent.height < vk_createInfo.extent.height ||
+props.maxExtent.depth < vk_createInfo.extent.depth)
+{
+return nullptr;
+}
+
+if (!core::bitflag(static_cast<IDeviceMemoryAllocation::E_EXTERNAL_HANDLE_TYPE>(props.compatibleTypes)).hasFlags(params.externalHandleTypes)) // incompatibility between requested types
+return nullptr;
+
+dedicatedOnly |= props.dedicatedOnly;
+}
+}
atkurtul marked this conversation as resolved.
+
 VkImage vk_image;
-if (m_devf.vk.vkCreateImage(m_vkdev,&vk_createInfo,nullptr,&vk_image)!=VK_SUCCESS)
+if (m_devf.vk.vkCreateImage(m_vkdev, &vk_createInfo, nullptr, &vk_image) != VK_SUCCESS)
 return nullptr;
-return core::make_smart_refctd_ptr<CVulkanImage>(this,std::move(params),vk_image);
+return core::make_smart_refctd_ptr<CVulkanImage>(this, std::move(params), dedicatedOnly, vk_image);
 }

 core::smart_refctd_ptr<IGPUImageView> CVulkanLogicalDevice::createImageView_impl(IGPUImageView::SCreationParams&& params)

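The validation loop in `createImage_impl` peels the requested external handle types off one bit at a time, using `hlsl::findLSB` to locate the lowest set bit and XOR to clear it. The same pattern can be sketched with plain integer arithmetic. The function name `splitIntoSingleBits` is hypothetical, and the sketch isolates the lowest bit with `mask & (~mask + 1)` rather than Nabla's helper.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Splits `mask` into its individual set bits, lowest bit first.
// Hypothetical helper illustrating the "visit each flag once" loop.
inline std::vector<uint32_t> splitIntoSingleBits(uint32_t mask)
{
    std::vector<uint32_t> bits;
    while (mask != 0u)
    {
        // Two's complement trick: keeps only the lowest set bit.
        const uint32_t lowest = mask & (~mask + 1u);
        bits.push_back(lowest);
        // Clear that bit so the loop terminates once every flag was visited.
        mask ^= lowest;
    }
    return bits;
}
```

Each returned value has exactly one bit set, so it can be handed to a per-handle-type query in the way the loop above passes `handleType` to `getExternalImageProperties`.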
8 changes: 6 additions & 2 deletions src/nbl/video/CVulkanPhysicalDevice.h
@@ -136,6 +136,7 @@ class CVulkanPhysicalDevice final : public IPhysicalDevice
 SExternalImageFormatProperties getExternalImageProperties_impl(
 asset::E_FORMAT format,
 IGPUImage::TILING tiling,
+IGPUImage::E_TYPE type,
 core::bitflag<IGPUImage::E_USAGE_FLAGS> usage,
 core::bitflag<IGPUImage::E_CREATE_FLAGS> flags,
 IDeviceMemoryAllocation::E_EXTERNAL_HANDLE_TYPE handleType) const override
@@ -150,7 +151,8 @@ class CVulkanPhysicalDevice final : public IPhysicalDevice
 VkPhysicalDeviceImageFormatInfo2 info = {
 .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2,
 .pNext = &extInfo,
-.format = static_cast<VkFormat>(format),
+.format = getVkFormatFromFormat(format),
+.type = static_cast<VkImageType>(type),
 .tiling = static_cast<VkImageTiling>(tiling),
 .usage = usage.value,
 .flags = flags.value,
@@ -163,7 +165,9 @@ class CVulkanPhysicalDevice final : public IPhysicalDevice
 .pNext = &externalProps,
 };

-vkGetPhysicalDeviceImageFormatProperties2(m_vkPhysicalDevice, &info, &props);
+VkResult re = vkGetPhysicalDeviceImageFormatProperties2(m_vkPhysicalDevice, &info, &props);
+if(VK_SUCCESS != re)
+return {};

 return
 {