shader_debugprintf: support new VVL-DEBUG-PRINTF message and fix VVL version check for API selection #1187

Open: wants to merge 6 commits into base: main

SRSaunders
Contributor

@SRSaunders SRSaunders commented Oct 9, 2024

Description

Fixes two issues that arose with Vulkan SDK 1.3.296:

  1. Supports new VVL-DEBUG-PRINTF callback message. Previous SDKs used WARNING-DEBUG-PRINTF or UNKNOWN-DEBUG-PRINTF. Without this fix the debug data is not available in the UI Overlay.
  2. Fixes my incorrect assumption that the Vulkan instance version matched the SDK version for all platforms - true on macOS but not true for Windows and Linux. This version is used to set the API level for the sample, which is important for performance and to avoid a previous defect in the Vulkan Validation layer. I have replaced the instance version check with a Validation Layer version check which is portable across all platforms: Win, Linux, macOS. Without this fix, performance is poor on Windows and Linux when using Vulkan SDK 1.3.296.
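The message-name handling described in item 1 can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the helper name `is_debug_printf_message` is hypothetical, and it assumes the string being tested is the `pMessageIdName` field that the debug utils callback receives in `VkDebugUtilsMessengerCallbackDataEXT`. Only the three ID strings come from the description above.

```cpp
#include <array>
#include <string_view>

// Message ID names that, per the description above, VVL has used for debug printf output.
constexpr std::array<std::string_view, 3> debug_printf_ids = {
    "VVL-DEBUG-PRINTF",        // new in Vulkan SDK 1.3.296
    "WARNING-DEBUG-PRINTF",    // used by previous SDKs
    "UNKNOWN-DEBUG-PRINTF",    // used by previous SDKs
};

// Hypothetical helper: returns true if the message ID name marks debug printf output.
// A null pointer is treated as "not a debug printf message".
inline bool is_debug_printf_message(const char *message_id_name)
{
    if (message_id_name == nullptr)
    {
        return false;
    }
    const std::string_view id{message_id_name};
    for (const std::string_view known : debug_printf_ids)
    {
        if (id == known)
        {
            return true;
        }
    }
    return false;
}
```

Matching against a small list rather than a single string keeps the sample working with both older and newer SDKs, which is the behaviour the description asks for.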

Fixes #1184.

Tested on Windows 10, Manjaro Linux, and macOS Ventura using Vulkan SDKs 1.3.290 and 1.3.296.

I hope this is the last time I have to fix this. It seems that VVL changes can easily break this sample.

General Checklist:

Please ensure the following points are checked:

  • My code follows the coding style
  • I have reviewed file licenses
  • I have commented any added functions (in line with Doxygen)
  • I have commented any code that could be hard to understand
  • My changes do not add any new compiler warnings
  • My changes do not add any new validation layer errors or warnings
  • I have used existing framework/helper functions where possible
  • My changes do not add any regressions
  • I have tested every sample to ensure everything runs correctly
  • This PR describes the scope and expected impact of the changes I am making

Note: The Samples CI runs a number of checks including:

  • I have updated the header Copyright to reflect the current year (CI build will fail if Copyright is out of date)
  • My changes build on Windows, Linux, macOS and Android. Otherwise I have documented any exceptions

If this PR contains framework changes:

  • I did a full batch run using the batch command line argument to make sure all samples still work properly

Sample Checklist

If your PR contains a new or modified sample, these further checks must be carried out in addition to the General Checklist:

  • I have tested the sample on at least one compliant Vulkan implementation
  • If the sample is vendor-specific, I have tagged it appropriately
  • I have stated on what implementation the sample has been tested so that others can test on different implementations and platforms
  • Any dependent assets have been merged and published in downstream modules
  • For new samples, I have added a paragraph with a summary to the appropriate chapter in the readme of the folder that the sample belongs to e.g. api samples readme
  • For new samples, I have added a tutorial README.md file to guide users through what they need to know to implement code using this feature. For example, see conditional_rendering
  • For new samples, I have added a link to the Antora navigation so that the sample will be listed at the Vulkan documentation site

Collaborator

@SaschaWillems SaschaWillems left a comment

Thank you very much for this PR. I do have some remarks though, mostly related to comment and code structure. I think it's important that people can easily follow and understand the changes ;)

@SRSaunders
Contributor Author

SRSaunders commented Oct 18, 2024

Thanks @SaschaWillems for the feedback. I am away on vacation this week, but will make the requested changes when I am back.

UPDATE: Back now and changes submitted in 0dc4963.

asuessenbach
asuessenbach previously approved these changes Oct 22, 2024
@SaschaWillems
Collaborator

No idea why, but with this PR and the latest SDK (1.3.296) on Windows, this sample is now again running with less than 1 fps. Forcing it to use VK 1.2 is somehow even slower (0 or inf fps).

If I force VK 1.0 performance is fine, but I don't get any debug output.

Not sure what is happening here and why this sample is so problematic. The debug printf sample from my own samples repo works just fine no matter the api version :/

@SRSaunders
Contributor Author

No idea why, but with this PR and the latest SDK (1.3.296) on Windows, this sample is now again running with less than 1 fps. Forcing it to use VK 1.2 is somehow even slower (0 or inf fps).

Very strange. Can I ask you to recheck before and after this PR, but being careful with your SDK version selection and project gen/build? I did a lot of testing with old and new SDKs on Windows 10, Linux and macOS before submitting originally. I will go back and test again to see if I can somehow duplicate what you are seeing.

If I force VK 1.0 performance is fine, but I don't get any debug output.

Debug PrintF requires Vulkan 1.1 or later. So no surprise that you are not getting debug output with API 1.0.

The debug printf sample from my own samples repo works just fine no matter the api version

I suspect your repo's sample relies on the intrinsic Debug PrintF capability at the shader level on Windows. However, this is not portable across platforms, whereas the Vulkan-Samples one uses the VVL version of the feature all the time. Perhaps that is why you are seeing a difference, at least on Windows. Again, I will go back and see if I can verify this.

@SaschaWillems
Collaborator

It also happens with the old code (before this PR). I only have SDK 1.3.296 installed.

So probably a regression in the validation layers?

@SRSaunders
Contributor Author

SRSaunders commented Oct 23, 2024

Ok, I have rechecked this PR on Windows 10, and even fast-forwarded my local branch to current main HEAD just to make sure. I am using Vulkan SDK 1.3.296.0 with my Radeon RX6600XT GPU. My Vulkan Configurator has been reset to default settings.

Before this PR I get:
[screenshot: main only]

After this PR I get:
[screenshot: shader_debugprintf FF]

Is it possible that your Vulkan Configurator has a custom setting that is interfering with the sample? Or possibly a difference between AMD and NVIDIA GPUs? I am just grasping at straws, since I cannot duplicate your issue and the 1.3.296 VVL seems to be working correctly using API 1.1 for debug printf.
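For readers following the version discussion, here is a rough sketch of how a validation layer version such as 1.3.296 can be decoded and compared. It assumes the version arrives as a packed 32-bit value using the same bit layout as `VK_MAKE_API_VERSION` (for example the `specVersion` field of `VkLayerProperties`); whether the PR reads exactly that field is an assumption. The `Version` struct and helper names are hypothetical, and the thresholds the sample actually uses are not stated in this thread.

```cpp
#include <cstdint>

// Hypothetical plain-data view of a decoded version number.
struct Version
{
    uint32_t major_version;
    uint32_t minor_version;
    uint32_t patch_version;
};

// Packs a version using the VK_MAKE_API_VERSION bit layout with variant 0:
// major in bits 22..28, minor in bits 12..21, patch in bits 0..11.
constexpr uint32_t make_version(uint32_t major_version, uint32_t minor_version, uint32_t patch_version)
{
    return (major_version << 22U) | (minor_version << 12U) | patch_version;
}

// Unpacks the three fields from a packed version value.
constexpr Version decode_version(uint32_t packed)
{
    return Version{(packed >> 22U) & 0x7FU, (packed >> 12U) & 0x3FFU, packed & 0xFFFU};
}

// Returns true if the packed version is greater than or equal to `minimum`.
inline bool version_at_least(uint32_t packed, Version minimum)
{
    const Version v = decode_version(packed);
    if (v.major_version != minimum.major_version)
    {
        return v.major_version > minimum.major_version;
    }
    if (v.minor_version != minimum.minor_version)
    {
        return v.minor_version > minimum.minor_version;
    }
    return v.patch_version >= minimum.patch_version;
}
```

A caller could then pick the API level for the sample from the result of `version_at_least`, with the cut-off version supplied by whoever knows which VVL releases are affected.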

Contributor

@asuessenbach asuessenbach left a comment

With this change, I have to distinguish two cases:

  1. VulkanConfigurator is running
      • VK_EXT_LAYER_SETTINGS_EXTENSION_NAME is available
      • instance creation is done by VulkanSample::create_instance (line 469)
      • render speed is high
      • debug_utils_message_callback is never called, thus no debugprintf output
  2. VulkanConfigurator is not running
      • VK_EXT_LAYER_SETTINGS_EXTENSION_NAME is not available
      • instance creation is done locally (line 523)
      • render speed is extremely low
      • debug_utils_message_callback is called, with higher rate than the frame rate

Note that in case 2 you're using VkValidationFeaturesEXT, which is part of VK_EXT_VALIDATION_FEATURES_EXTENSION_NAME, but you don't ask for that extension in the ShaderDebugPrintf constructor (or anywhere else). In fact, that extension is not supported on my machine. Strange that the VVL doesn't complain there.

@SaschaWillems
Collaborator

That would explain why it's so slow for me. I never ran that sample with the VulkanConfigurator running. That's case 2.

@SRSaunders
Contributor Author

SRSaunders commented Oct 24, 2024

Thanks @asuessenbach for pointing out the missing VK_EXT_validation_features extension. I have made a few changes that might make a difference as follows:

  1. Moved layer settings out of the constructor, and into ShaderDebugPrintf::create_instance(). Now it will run only if the VK_EXT_layer_settings extension is available. This part is for encapsulation only and will not change behaviour.
  2. Added and enabled the VK_EXT_validation_features extension when the VK_EXT_layer_settings extension is not available at runtime. This might change behaviour, but I am concerned about @asuessenbach's comment that the extension is not available on his machine. I'm not sure how that is possible.
  3. Fixed an incorrect string comparison operation for VK_EXT_layer_settings in [HPP]Instance::[HPP]Instance(). This was my mistake from an earlier PR. This could have prevented proper specification of the validation layer feature settings when VK_EXT_layer_settings is active. Again, this could change behaviour.

These changes may not be the final solution as I have observed the following when testing:

  1. Linux (Manjaro) using Vulkan 1.3.295 (from pkg mgr) and VVL 1.3.290 (from pkg mgr): this PR works properly (good frame rate, debug data available) when running with vkconfig and without. VK_EXT_layer_settings is only available when vkconfig is active. In this case the debug data is available both in the UI and in the stdout console. No performance issues are visible in either case.
  2. macOS (Ventura) using Vulkan SDK 1.3.296: this PR works properly (good frame rate, debug data available) when running with vkconfig and without. VK_EXT_layer_settings is available both when vkconfig is inactive and active - this is a difference vs Linux. In the latter case (vkconfig active) the debug data is available both in the UI and in the stdout console. No performance issues are present. Also tested with Vulkan SDK 1.3.290 and the results are the same - no performance problems. The only issue is that vkconfig does not appear to recognize the repeated message limit for the new VVL-* messages (vs. the previous INFO-* or WARNING-* messages, etc). A minor issue but likely a bug.
  3. Windows 10 using Vulkan SDK 1.3.296 with my AMD 6600XT GPU: this PR works properly (good frame rate, debug data available) when running without vkconfig only. When vkconfig is active, the sample will not start and complains about an unsupported extension during vkCreateInstance(). However, VK_EXT_layer_settings is available during enumeration when vkconfig is active. Something very strange is going on here - either a bug on the Windows side or something I do not understand. I am not sure how VK_EXT_layer_settings can be enumerated but not supported. See my console output in this case:

[screenshot: nolayerext]

In summary:

  1. Linux: works properly using VVL 1.3.290 with and without vkconfig. Can't test VVL 1.3.296 since it is not yet available as a package for my Manjaro distro.
  2. macOS: works properly using VVL 1.3.290 and 1.3.296 with and without vkconfig.
  3. Windows 10 on AMD 6600XT GPU: works properly using VVL 1.3.296 without vkconfig only.

Lastly, I thought VK_EXT_layer_settings was meant to replace and deprecate VK_EXT_validation_features. I don't understand why VK_EXT_layer_settings is available all the time on macOS, but for Windows and Linux seems to be enabled only when vkconfig is running. This seems incorrect to me. Can you explain this?

@SRSaunders
Contributor Author

SRSaunders commented Oct 24, 2024

Ok, I think I have finally figured it out. It appears that you don't need to actually enable the VK_EXT_layer_settings extension in order to use it. I’m not sure if this is a feature or a bug. In any case, I have updated the sample and [HPP]Instance::[HPP]Instance() to check for availability of the extension vs. enablement. This approach works across all platforms and behaviours appear to be consistent now:

  1. Sample is tolerant of Vulkan SDK versions: tested against VVL 1.3.290 (Win, Linux, macOS) and 1.3.296 (Win, macOS)
  2. Sample is tolerant of vkconfig running or not running. The only thing to be careful of when running vkconfig is to make sure "Limit Duplicated Messages" is turned off - otherwise debug callback messages will be suppressed and the debug output UI will be blank.
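The availability-versus-enablement distinction described in this comment can be sketched as below. This is a simplified stand-in, not the [HPP]Instance code: the `InstanceExtensionState` struct and both helpers are hypothetical, and the lists would in practice be filled from instance extension enumeration and from the names passed to instance creation.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the extension lists an instance wrapper might track.
struct InstanceExtensionState
{
    std::vector<std::string> available;    // names reported by enumeration
    std::vector<std::string> enabled;      // names requested at instance creation
};

// Returns true if `name` is present in `names`.
inline bool contains(const std::vector<std::string> &names, const std::string &name)
{
    return std::find(names.begin(), names.end(), name) != names.end();
}

// Gates on availability, as the comment above describes, and deliberately
// ignores whether the extension was also enabled.
inline bool can_use_layer_settings(const InstanceExtensionState &state)
{
    return contains(state.available, "VK_EXT_layer_settings");
}
```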

@asuessenbach
Contributor

AFAIK, those two extensions (VK_EXT_layer_settings and VK_EXT_validation_features) are not supported by any NVIDIA GPU, but are provided by a layer injected by, for example, the VulkanConfigurator. That might explain why it's that slow.

Besides that, just to make sure it has been noted: as VK_EXT_validation_features is deprecated in favour of VK_EXT_layer_settings, using VK_EXT_validation_features would just be a fallback solution. I don't know if it's worth having that. And you should bail out in a friendly way if none of those extensions is available, maybe with a hint to the VulkanConfigurator.
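One way to read this suggestion is as a three-way choice: prefer VK_EXT_layer_settings, fall back to VK_EXT_validation_features, and otherwise stop with a helpful message. The sketch below is hypothetical throughout (the enum, function names, and message wording are illustrative and not taken from the PR); only the two extension names and the hint about the VulkanConfigurator come from the thread.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical outcome of checking which configuration mechanism is available.
enum class ValidationConfigPath
{
    LayerSettings,         // preferred: VK_EXT_layer_settings
    ValidationFeatures,    // fallback: deprecated VK_EXT_validation_features
    Unavailable            // neither extension was enumerated
};

// Picks a path from the list of enumerated instance extension names.
inline ValidationConfigPath choose_validation_config_path(const std::vector<std::string> &available_extensions)
{
    const auto has = [&available_extensions](const char *name) {
        return std::find(available_extensions.begin(), available_extensions.end(), name) != available_extensions.end();
    };
    if (has("VK_EXT_layer_settings"))
    {
        return ValidationConfigPath::LayerSettings;
    }
    if (has("VK_EXT_validation_features"))
    {
        return ValidationConfigPath::ValidationFeatures;
    }
    return ValidationConfigPath::Unavailable;
}

// Returns a user-facing message for the chosen path (wording is illustrative only).
inline std::string describe_path(ValidationConfigPath path)
{
    switch (path)
    {
        case ValidationConfigPath::LayerSettings:
            return "Configuring debug printf through VK_EXT_layer_settings.";
        case ValidationConfigPath::ValidationFeatures:
            return "VK_EXT_layer_settings not found, falling back to VK_EXT_validation_features.";
        case ValidationConfigPath::Unavailable:
            return "Debug printf cannot be configured: neither VK_EXT_layer_settings nor "
                   "VK_EXT_validation_features is available. Try running with the VulkanConfigurator active.";
    }
    return std::string{};
}
```

In the `Unavailable` case the sample could log the message and exit cleanly instead of failing inside instance creation.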

Successfully merging this pull request may close these issues.

shader_debugprintf problems with new VulkanSDK 1.3.296
3 participants