-
Notifications
You must be signed in to change notification settings - Fork 417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Debug printf reports "Mixed stage shader module not supported" #4350
Comments
I am fairly certain this restriction can be lifted fairly easily. I think we only put this restriction in because we weren't certain this capability (multiple stages in a single module) would be used and so didn't want to spend the effort making sure it worked. I will take a look and report back. |
Aside from debug printf, please keep in mind that there are currently known holes in general shader validation when it comes to multiple entry points, particularly with respect to push constants and transform feedback. See #2450 for some more details. |
That's good to hear, and thanks for the heads up about push constants and transform feedback. For what it's worth, it seems I can use a regex to remove all lines starting with "OpEntryPoint" that aren't used by the current shader module, and then use the assembler in SPIRV tools to get a program that works---or at least, the validation layers don't report the above error anymore. It would be nicer to not have to modify my assembly on the fly and re-assemble per-shader module though. :) |
OK. After reviewing the code, I now remember the reason for this restriction. As I said earlier, multi-stage modules can be supported, but they will require additional handling/logic. The issue is that, for the general instrumentation support which printf is built on, there is stage-specific info that is written out. So if there are any functions which are shared between call trees of two entry points for different kinds of shaders, those functions need to be cloned, one copy for each different stage it is involved in. Of course, if there is no shared functions between entry points of different stages in a module, the restriction does not need to be enforced for that module. We would just need to add a little logic to verify no functions are shared between entry points with different stage types. We would also need to add a little additional logic which can handle a different stage type for each entry point. This would be a good next step expansion of this capability as inlining is a common offline optimization for SPIR-V which happens to eliminate any possibility of shared functions. Would your use case be able to take advantage of this additional checking? If not, the next step would be to do the cloning. It would be fairly mechanical. Another approach is to observe that the printf mechanism should not really require stage-specific data to be written, and so plumb in logic into the general instrumentation mechanism to inhibit this stage-specific data to be written, if it isn't done already. This approach would bear a little more study if it is being considered. |
One of the motivating reasons we have for including multiple entry points within one program is that entry points can share common functions. For example, we might use the same quad software intersection test in a ray tracing intersection program as we do in a ray generation program for handling analytical light sources, and we'll share more general shading code is across closest hit programs for different geometry types. This keeps the code "DRY", and helps eliminate bugs that can arise from not updating all instances of the same function across otherwise separated programs. If inlining functions solves this issue, that would be fine with me. In OptiX, we're used to having to inline functions for other reasons, so that would be a familiar requirement. I personally lean towards your final observation long term---I'm not entirely sure why printf requires stage-specific data to be written out, though I might be missing some subtleties here with SPV. |
If avoiding code duplication is the main goal, could using |
A common HLSL file would help yes, but code de-duplication isn’t the main goal. The real goal is to replicate our current OptiX workflow as much as possible, where nvcc compiles all ray tracing shader entry points into one single PTX output, and printf’s can be used throughout the device code. I’m close to having this work by substituting nvcc for dxc and using the debug printf extension. This issue on multiple entry points disabling printf is the main hang up to achieving this goal. |
Quick update, unfortunately it seems my workaround of removing unnecessary OpEntryPoint instructions doesn’t work. My modified SPIRV passes validation with SPIRV Tools, but then attempting to create the shader module from the modified IR causes an access violation somewhere in the driver. So I’m a bit back to square one… If we can get this issue fixed that would be really helpful. At the moment though I’m forced to settle for no debug printf, unless I can figure out how to successfully modify the SPIRV to remove entry points on the fly. 😞 Edit: actually I might have figured out how to patch the SPIRV again such that it's accepted by Vulkan without crashing... it seems like in addition to the "OpEntryPoint" declaration lines, I also need to remove the entrypoint functions themselves, as well as the "OpName"s that seem to declare the functions. Edit 2: After regexing out the other entry points when building my individual ray tracing shader modules, I can confirm that the printf's work now. So the workaround here does work. So, until this is fixed, just for the sake of visibility, as a workaround you can :
That seems to work for me. Ideally I wouldn't have to do that, and I worry about robustness, but it does work. |
I am glad you figured out a workaround. But there appears to be a need for the printf capability to be enhanced in terms of shaders that are supported. From your reply, it appears that we should at least check for and handle the case where there is no shared functions between entry points. This condition can be accomplished with spirv-opt inlining. The other option is to create a stage-agnostic mode of instrumentation support and utilize that with printf. |
Agreed. I would gladly test this change too. spirv-opt inlining is completely fine with me. It seems like that wouldn't be too much work, though I'm unfamiliar with what's happening under the hood. Long term, I suspect the stage-agnostic approach will scale better. |
Looking more at the stage-agnostic solution, I believe this is not a good solution as I believe there is a good reason to keep the stage specificity. Since we possibly/likely are generating multiple printf statements, the printf instrumentation adds information to each message stating the stage and the instance of the stage that generated that message. This was initially something that was done as part of support for GPU-assisted validation. Since it seemed useful for printf as well, we kept it. Ripping this information out of debug printf would possibly cause compatibility issues, so I no longer support the stage-agnostic solution. Rather, I recommend the solution of checking for and cloning shared functions if necessary. |
That partially makes sense, but from the outside I'm still a bit confused... Just to clarify, why is it an issue for printf that multiple entry points share common functions? I get why printf might want to know which entry point called that printf. What I don't understand is why shared functions prevent this metadata from being attached to the printf. Perhaps you could just enable the printf's metadata to understand that some functions are going to be shared between entry points? |
Debug printf is built on instrumentation code that is stage-specific. We could possibly re-design the instrumentation to not be stage-specific, but it would add to size, speed and complexity for GPU-assisted validation and debug printf for the majority of codes out there. And it would be costly and disruptive. At the time the instrumentation was designed, and even today, stage-specificity is not a problem for the vast majority of codes since glslang is not generating multi-stage modules. In my estimation, the overall least cost, most maintainable, and best performing solution for the community is the one I recommend above. |
Hm, ok, just to clarify,
That common function needs to be inlined so that the printf has context as to which shader stage is calling it? I am ok with the solution you proposed. Better that then what I'm forced to do right now, where I have to modify my SPIR-V on the fly to get the Khronos validation layers to work. I'm mostly thinking about scalability in the future, eg, in a world where there are hundreds to thousands of compute entry points like we see in CUDA... |
Yes, commonFunction() would need to be inlined. This can be easily (?) done with spirv-opt --eliminate-dead-code-aggressive. The ? suggests it is possible that this inlining might not work due to some restriction, but fairly unlikely as it has been pretty well tested and generally works. We would still need to add logic to the debugPrintf pass to allow for multi-stage when no common function exists, and check that that condition is true. |
Sorry. Must have had a case of the Monday brain cramps. Meant to say: This can be easily (?) done with |
Sounds like a plan! :) Could this inlining be done with the dxc compiler? At the moment, I'm using dxc to compile multiple entry points into one .spv file, and then I use SPIR-V tools to tweak the final spv. Ideally I could drop SPIR-V tools from my repo and just use the output from dxc directly. Wondering if it's possible somehow to use "--inline-entry-points-exhaustive" with dxc then. Edit: Sorry, still learning some of these tools. Appears that spirv-opt is a binary that I'd run after using the dxc compiler to take hlsl -> spv. So then that would mean dxc wouldn't need to change, just need to add in spirv-opt into my build tools. I can create a couple easy test cases of when exhaustive inlining has been properly done vs not, and then use that to test out the modified logic to the debugPrintf pass. Update: reviewing -O in the spirv-optimizer, it appears that inline-entry-points-exhaustive is automatically included. I suspect that the majority of dxc hlsl->spv users use -O by default, so that gives more credit to this plan being relatively smooth to implement. |
Ok, so given this code:
Compiling with DXC and disabling optimization yields:
That %commonFunction is an issue for printf. If I allow dxc to use -O (which includes inline-entry-points-exhaustive) then I get this:
Here's the assembled spv for both: This one's the one with common functions inlined And here's one without. So, the next step would be to write that logic to reject the unoptimized one and allow for the optimized one. |
According to the extension spec, there are no clauses to rule out the use of printf from a SPIRV module that contains a mix of different entry-point types. I understand there might be some optimizations that allows things to be faster when there is not such use, but there are widespread undesired impact for low-level tools like the validation layer to make any further assumptions on users code beyond the spec. In this case, it is more appropriate to keep the current implementation to be a "fast" path but allow falling back to something more general and lenient when the assumption does not hold. |
The "stage-specific" info Greg mentioned is the A simple way out of this might be exploiting the fact if all the Entry Points in a multi-entrypoint are Ray Tracing stage, they all use One possible mechanism for solution could be to has the "which stage" value be a spec constant, and then at pipeline creation time, track it was a multi-entrypoint
Note this issue expands not only to Debug Printf, but to all GPU-AV logic, and since multi-entrypoint is more common with Ray Tracing stages, there will need to be some form a solution here |
The printf layer should not make any assumptions about what entry point stages exist in a program. If users need additional information about what entry point is calling a printf for GPU-AV, this should be an explicit extension of some sort. I agree that knowing what entry point is calling a printf could be useful for some specific GPU-AV problems, but it’s a heavy handed constraint that deviates from spec and is preventing printf from working for a much larger set of users who do not need this information. Users could also just as easily print out what entry point is calling the function as part of the printf message itself. As the spec stands today, it is perfectly legal to use printf anywhere in a SPV program containing a mix of compute shaders, a vertex and pixel shaders, raygen, closest hit and miss shaders, and any other type. (Unless I’m mistaken, in which case please point me to where this is stated…) Compiler developers at NVIDIA are depending on this being true as we push forward with mixed entry point codes with arbitrary stage types. Generating SPV should never be required to adhere to more constraints than what’s stated in the spec. |
Up until a couple months ago, the debug printf and GPU-AV instrumentation shared the same code for logging output data. GPU-AV does need stage information and since they shared the same code, debug printf got it 'for free' even though it wasn't needed. This commit broke that coupling. I've put up a SPIRV-Tools PR that removes the stage info from the debug printf instrumentation. I'm working on the corresponding VVL change to go with it. We'll still need to solve the multi-stage, multi-entrypoint shader problem for GPU-AV separately. |
Awesome. Sounds like things are moving in the right direction. Thank you for your help. |
Remove stage specific debug info that is only needed by GPU-AV. This allows debug printfs to be used in multi stage shader modules. Requires KhronosGroup/SPIRV-Tools#5495 Fixes KhronosGroup#4350
Remove stage specific debug info that is only needed by GPU-AV. This allows debug printfs to be used in multi stage shader modules. Requires KhronosGroup/SPIRV-Tools#5495 Fixes KhronosGroup#4350
Remove stage specific debug info that is only needed by GPU-AV. This allows debug printfs to be used in multi stage shader modules. Requires KhronosGroup/SPIRV-Tools#5495 Fixes KhronosGroup#4350
Remove stage specific debug info that is only needed by GPU-AV. This allows debug printfs to be used in multi stage shader modules. Requires KhronosGroup/SPIRV-Tools#5495 Fixes KhronosGroup#4350 Also includes the VUL linking change from KhronosGroup#7050
Remove stage specific debug info that is only needed by GPU-AV. This allows debug printfs to be used in multi stage shader modules. Requires KhronosGroup/SPIRV-Tools#5495 Fixes #4350 Also includes the VUL linking change from #7050
Fix some formatting and add an explanation of the changes made for Issue KhronosGroup#4350, just in case someone runs into them.
Fix some formatting and add an explanation of the changes made for Issue #4350, just in case someone runs into them.
Describe the Issue
I am trying to compile a collection of ray tracing entry points all from one .hlsl file into one .spv file, which I currently can do through the DXC compiler. I would also like to use "printf" for debugging. If I could do this, I could very closely mimic our current CUDA/OptiX workflow, but with Vulkan, and in a much more device-agnostic setup.
Unfortunately, when I try to create a shader module from this .spv while also using the GPU debug printf extension, I get the following error from vkCreateShaderModule: "Mixed stage shader module not supported"
At the moment, I'm therefore forced to either tell users to split every entry point into a separate .hlsl file---adding a tremendous amount of book keeping---or alternatively I must tell my users that we cannot use printf in any shaders due to the above error message.
Valid Usage ID
ERROR: [-1804403757][UNASSIGNED-Debug-Printf] : Validation Error: [ UNASSIGNED-Debug-Printf ] Object 0: handle = 0x2be7f7e7690, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x9472fbd3 | Error during shader instrumentation: line 0: Mixed stage shader module not supported
Environment:
Additional context
Here's a little shader that I'm trying to get working:
deviceCode.hlsl.txt
and here's the SPIR-V generated by dxc:
spirv.txt
I'm using the following DXC options:
-spirv
-fspv-target-env=vulkan1.1spirv1.4
-T lib_6_3
-fspv-extension=SPV_KHR_ray_tracing
-fspv-extension=SPV_KHR_ray_query
-fspv-extension=SPV_KHR_non_semantic_info
The text was updated successfully, but these errors were encountered: