forked from gpuweb/cts
[pull] main from gpuweb:main #15
Merged
Conversation
The clamp validation test creates a 'foo' function that contains the code to be checked. The entry point must call 'foo' in order for the override expressions it contains to be checked. Bug: crbug.com/351378281
'values' test:
* Infer the result type instead of explicitly specifying it. If a type is specified, it is used as the explicit type of the result variable; inferring it instead fixes cases where the type is abstract.
* low == high is an error (gpuweb/gpuweb#4616).

'partial_eval_errs' test:
* The test function foo() must be called from the entry point so that overrides are validated.

'early_eval_errs' test:
* Fix argument order: the low and high args are first and second.

Bug: crbug.com/351378281
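As a minimal WGSL sketch of the point above (illustrative, not the actual test shader): override expressions are only evaluated at pipeline creation if the function containing them is statically reachable from an entry point.

```ts
// Illustrative WGSL: the clamp() arguments involve overrides, so the
// expression is only checked at pipeline creation when foo() is statically
// reachable from the entry point.
const wgsl = `
override low: i32;
override high: i32;

fn foo() -> i32 {
  // Only validated (e.g. the low/high relationship) when foo() is reached
  // from the entry point.
  return clamp(5, low, high);
}

@compute @workgroup_size(1)
fn main() {
  _ = foo();  // Without this call, the override expression is never evaluated.
}
`;
```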
The tests query the GPU's mip-to-mip interpolation. The check that the query was successful was too strict, so this PR relaxes it.
For const cases, low < high is required. See gpuweb/gpuweb#4616. Bug: crbug.com/351378281
Add variants where the function containing the call being tested either is or is not statically accessed from the tested shader's entry point. This matters for override validation.
…er (#3906) This matters when exercising the 'override' cases.
#3905) This matters when exercising the 'override' cases.
The issue is that on Chrome and Firefox on Intel Mac, when sampling between 2 mip levels using `textureSampleLevel` in a compute shader, the weights used for mixing are very unexpected. The same issue doesn't happen in Safari TP, so a fix probably exists. For now though, the issue also doesn't happen when using a fragment shader, so this switches to using a fragment shader to look up these weights. That is more appropriate for the current tests anyway, because the tests run in fragment shaders. Will file an issue to test all stages.
I noticed that when I click stop, it sometimes seems to be ignored. I think this is why.
This change reduces the number of shader modules created by 2 orders of magnitude. The issue is that offsets must be constants, so randomly generating them creates a new shader each time. This change makes them less random: they are now the same for every case of a given test, so the shader module is reused. For example: webgpu:shader,execution,expression,call,builtin,textureSampleLevel:sampled_2d_coords:* goes from 1197 shader modules to 4
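A rough sketch of the idea (the helpers here are hypothetical, not the actual CTS code): derive the constant offsets from the test name instead of `Math.random()`, so every case of a given test produces identical WGSL.

```ts
// Offsets must be const-expressions in WGSL, so they are baked into the
// shader source. Deriving them from a per-test hash keeps them
// "random-looking" but identical across all cases of one test.
function hashString(s: string): number {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (h * 31 + s.charCodeAt(i)) | 0;
  }
  return h >>> 0;
}

function offsetsForTest(testName: string): [number, number] {
  const h = hashString(testName);
  // WGSL texture-builtin offsets must be in [-8, 7].
  return [(h % 16) - 8, ((h >>> 4) % 16) - 8];
}
```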
This is useful for CQs where bots have different hardware than devs have locally.
The old code took the lcm of width and height for cubemaps, because cubemaps must be square and texture sizes must be a multiple of the block size. With a format with a blockSize of 5x8 and a minSize of 32, that ends up taking lcm(35, 32), which is 1120. For a cube array it would then allocate 1120x1120x24, and if the format is rgba32float that's over 400 MB. The new code just takes the lcm of blockWidth and blockHeight and aligns to that, which is much, much smaller.
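A sketch of the size computation described above (function names are illustrative, not the actual texture_utils.ts helpers):

```ts
function gcd(a: number, b: number): number {
  return b === 0 ? a : gcd(b, a % b);
}
const lcm = (a: number, b: number) => (a / gcd(a, b)) * b;
const alignUp = (n: number, align: number) => Math.ceil(n / align) * align;

// Old approach (per the description): align width and height to the block
// size first, then take the lcm of the results.
// blockSize 5x8, minSize 32 -> lcm(alignUp(32, 5), alignUp(32, 8)) = lcm(35, 32)
const oldSize = lcm(alignUp(32, 5), alignUp(32, 8)); // 1120

// New approach: align the minimum size to lcm(blockWidth, blockHeight).
const newSize = alignUp(32, lcm(5, 8)); // alignUp(32, 40) = 40

console.log(oldSize, newSize);
```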
Co-authored-by: Peter McNeeley <[email protected]>
* Tests subgroup_size and subgroup_invocation_id in fragment shaders
Remove the OT tokens for Chrome's initial WebGPU feature launch. crbug.com/358117283
textureGather needs to avoid choosing the centers of texels because, depending on the backend, it could choose the texels to the left or right (or up or down) of the texture coordinate. There was code to do this, but it needed to know it was being used for textureGather.
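A hypothetical helper (not the actual CTS code) sketching the constraint: keep generated coordinates a safe distance away from texel centers so every backend gathers the same 2x2 footprint. The 0.25-texel margin is an arbitrary choice for illustration.

```ts
// Texel centers sit at half-integer coordinates in texel space
// (..., -2.5, -1.5, -0.5, 0.5, 1.5, ...). If a coordinate is too close to a
// center, push it away so the gathered footprint is unambiguous.
function nudgeAwayFromTexelCenter(coordInTexels: number, minDist = 0.25): number {
  const nearestCenter = Math.floor(coordInTexels) + 0.5;
  const d = coordInTexels - nearestCenter; // signed distance to that center
  if (Math.abs(d) >= minDist) return coordInTexels;
  return nearestCenter + (d < 0 ? -minDist : minDist);
}
```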
Each texture test makes 50 calls to `textureXXX` and then checks that they all match the expected results. For each one that does NOT match, a slow binary search runs to find the samples that influenced the result. This change makes only the first failure do that slow check. Ideally no tests would fail, but when they do, checking all 50 sample points can cause test bots to time out.
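A sketch of that policy (the names here are illustrative, not the actual test code): run the expensive sample-point search only for the first mismatch, and report the rest without the detailed diagnosis.

```ts
// Only the first mismatch triggers the slow binary search over sample points;
// later mismatches are still reported as failures, just without the detail.
function checkResults<T>(
  results: T[],
  expected: T[],
  matches: (actual: T, want: T) => boolean,
  diagnose: (index: number) => string // the expensive sample-point search
): string[] {
  const errors: string[] = [];
  let diagnosed = false;
  for (let i = 0; i < results.length; i++) {
    if (matches(results[i], expected[i])) continue;
    if (!diagnosed) {
      errors.push(diagnose(i)); // run the slow search only once
      diagnosed = true;
    } else {
      errors.push(`call ${i}: result does not match the expected value`);
    }
  }
  return errors;
}
```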
* Remaining subgroup validation tests
* Adds must_use tests to ballot
* Adds const id requirement to broadcast
* Adds tests for other builtins:
  * broadcast first
  * elect
  * min, max
  * and, or, xor
  * shuffle, shuffle xor, shuffle up, shuffle down
  * quad broadcast
  * quad swap x, quad swap y, quad swap diagonal
* Add enable validation tests to broadcast and ballot
The code wasn't correctly discarding edge cases for negative texture coordinates. For example, -2.5 sits at the center of a texel, and textureGather could go either left or right there; we need to avoid that case. Also, textureGather always uses `nearest` for mipmapFilter. This was fine as it was, since it would always pass 0 for mipLevel, which meant the 2nd mip level always contributed zero, but arguably it should take the `nearest` path.
See comments in texture_utils.ts
* Compute tests with all active invocations and partially active invocations
* Removed the unimplemented dynamically uniform subgroupBroadcast test (due to the const requirement)
Previously, Dawn didn't correctly validate that render pipelines with no attachments are errors. After fixing Dawn, some CTS tests hit the new validation. Fix them up to follow the WebGPU spec. See #3754
I missed a case. This should fix it for now.
Adds tests that the padding between vec3h elements in an array is not disturbed.
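Illustrative WGSL (not the actual test shader): `vec3<f16>` has a size of 6 bytes but an alignment of 8, so each array element carries 2 trailing padding bytes that a write to the element must not disturb.

```ts
const wgsl = `
enable f16;

// array<vec3<f16>, N> has an element stride of 8 bytes:
// three 2-byte components followed by 2 bytes of padding.
@group(0) @binding(0) var<storage, read_write> data: array<vec3<f16>, 4>;

@compute @workgroup_size(1)
fn main() {
  // Writing one element must leave the padding bytes of every element intact;
  // the test reads the buffer back and checks those bytes are unchanged.
  data[1] = vec3<f16>(1.0h, 2.0h, 3.0h);
}
`;
```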
The tests fill depth textures with random values, except if the comparison mode is 'equal' or 'not-equal', in which case we only put 0.0, 0.6, and 1.0 in the texture. depthRef for all tests is 0.0, 0.5, or 1.0, so the comparisons should produce valid tests.
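A sketch of that value selection (a hypothetical helper, not the actual test code):

```ts
type CompareMode =
  | 'never' | 'less' | 'equal' | 'less-equal'
  | 'greater' | 'not-equal' | 'greater-equal' | 'always';

function depthTexelValue(mode: CompareMode, rand: () => number): number {
  // For 'equal' and 'not-equal', random texels would almost never be exactly
  // equal to the depthRef values (0.0, 0.5, 1.0), so restrict the texels to
  // 0.0, 0.6, 1.0: 0.0 and 1.0 can compare equal, 0.6 never does.
  if (mode === 'equal' || mode === 'not-equal') {
    return [0.0, 0.6, 1.0][Math.floor(rand() * 3)];
  }
  return rand();
}
```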
Test texture builtins on all stages. Previously only the fragment stage was tested. Note: some of these are expected to fail on Intel Mac because, in compute shaders, Intel Mac doesn't do bilinear interpolation between mip levels, at least not when argument buffers aren't used.
The derivative portion of mip level selection was being incorrectly quantized. That quantization is left over from a previous implementation; now, quantization happens when the mip level is chosen.
The default gradient mult used [1,1,1] and then set one of the components to the value that would generate the required mipLevel. But that ignored the fact that if the required mipLevel is less than 0, the leftover 1s would generate a mipLevel of 0. For now, make the default [0,0,0]. An alternative would be to set all components to the value that generates the desired mipLevel, or to set them to something less than that value.
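A rough numeric illustration, assuming the implicit level of detail is approximately log2 of the largest gradient measured in texels:

```ts
// With the old default of [1, 1, 1], the untouched components each imply a
// gradient of one texel (mip level 0), so the maximum can never drop below 0
// even when the component we set asks for a negative level.
const desiredMipLevel = -1;
const gradientForLevel = 2 ** desiredMipLevel; // 0.5 texels -> level -1

const levelFromGradient = (g: number[]) => Math.log2(Math.max(...g));

const oldDefault = [1, 1, gradientForLevel];
const newDefault = [0, 0, gradientForLevel];

console.log(levelFromGradient(oldDefault)); // 0  (the leftover 1s win)
console.log(levelFromGradient(newDefault)); // -1 (only the set component matters)
```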
…3998) Signed-off-by: sagudev <[email protected]> Co-authored-by: François Beaufort <[email protected]>
It used to be that mix weights for all stages were queried at initialization time. The problem is that on Intel Mac the compute stage always fails, which meant the majority of texture tests just would not run on Intel Mac, at least on Chrome and Firefox at the moment. Changing the code to load them per stage, on demand, means the texture builtin tests can run for both the compute and fragment stages on Intel Mac. You could argue this doesn't need to be fixed, since Chrome and Firefox should fix their texture code on Intel Mac, but at the moment there are a bunch of higher-priority things being worked on, so it's best not to lose coverage for compute and fragment shaders.
Compat mode is allowed to have no storage buffers, but the texture-builtin tests were relying on storage buffers both for inputs and outputs. For inputs, switching to a uniform buffer is fine: there are 50 calls with at most 5 parameters each, aligned to 16 bytes, so that's 4000 bytes, which fits within the minimum guaranteed uniform buffer binding size. For outputs, switching to writing output to a texture via the fragment shader works, but in order not to have to change how derivatives work, we instead render 1 instance at a time and use setViewport to choose which texel to write to. We were using @builtin(position) in the fragment shader and expecting it to be 0.5, 0.5, but since we're writing to different fragments now, we have to subtract the instance index (v.ndx) from position.x to get it back to 0.5, 0.5.
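The sizing arithmetic from the description above:

```ts
const numCalls = 50;          // calls checked per test
const maxParamsPerCall = 5;   // each parameter padded to 16-byte alignment
const bytesPerParam = 16;

const requiredBytes = numCalls * maxParamsPerCall * bytesPerParam; // 4000

// WebGPU guarantees maxUniformBufferBindingSize >= 65536 bytes, so 4000 bytes
// fits comfortably in a single uniform buffer binding on every implementation.
console.log(requiredBytes);
```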
Several GPUs have bugs with offset. Making it a case instead of a subcase means it's easier to suppress these failures without losing other coverage.
Depth textures were excluded originally because there was no way to fill one via copyBufferToTexture. We later added a way to fill them, which was used for the textureLoad tests; now going back and enabling them for other tests. Note that it might be better to move minFilter to a case instead of a subcase, but that would break all expectations in CQs, so it seemed best to filter these in the test itself.
identifySamplePoints works by doing a binary search: it fills a texture with black (0,0,0,0) and white (1,1,1,1) texels and then samples it. Any non-zero result means those white texels were sampled. This doesn't work for comparisons like textureSampleCompare, textureSampleCompareLevel, and textureGatherCompare because their results are 0 or 1; for example, if the comparison is 'always' then all texels will show up as sampled. So instead, if the builtin being tested is a comparison, we convert the call to the corresponding non-comparison builtin:

* textureSampleCompare -> textureSample
* textureSampleCompareLevel -> textureSampleLevel
* textureGatherCompare -> textureGather

This lets us find the sample points as best we can (it assumes those functions sample the same texels). Once we have the sample points, we want to look up the actual values of the texels and print them out. Doing that requires reading the texture back from the GPU. We made the texture ourselves, so we could theoretically pass the data we sent to the GPU down to identifySamplePoints, but it seems better to get the values from the GPU itself so they at least made a round trip through the GPU. Then, if it's a comparison, we print out the result of each comparison with that texel. Hopefully this will help us identify why these tests don't pass on some devices.
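The substitution described above, written as a simple lookup (a sketch, not the actual identifySamplePoints code):

```ts
const comparisonToNonComparison: Record<string, string> = {
  textureSampleCompare: 'textureSample',
  textureSampleCompareLevel: 'textureSampleLevel',
  textureGatherCompare: 'textureGather',
};

// Comparison builtins only return 0 or 1, which defeats the black/white
// binary search, so search for sample points with the corresponding
// non-comparison builtin and assume it samples the same texels.
function builtinForSamplePointSearch(builtin: string): string {
  return comparisonToNonComparison[builtin] ?? builtin;
}
```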
No idea why the pre-submit didn't catch this 🤷♂️
…4001) Many of these tests have too many subcases, which run in parallel and can run out of resources. We can use batching to break tests into more cases (without having to break them up arbitrarily by the numeric values passed to builtins). Not all of the :values: tests actually have more than 125 subcases; if they don't, then `batch` has no effect. I added `batch` to all of them anyway so that when tests are added in the future, they'll serve as an example and get copied over. Hopefully helps with https://crbug.com/373478528 and https://crbug.com/373485785
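A generic illustration of the batching idea (not the CTS params-builder API): promote a batch index to a case-level parameter so each case only covers a slice of the subcases.

```ts
// Each batch becomes its own case; a batch covers at most `batchSize`
// subcases, so no single case keeps too many subcases in flight.
function numBatches(totalSubcases: number, batchSize: number): number {
  return Math.max(1, Math.ceil(totalSubcases / batchSize));
}

function subcasesForBatch<T>(subcases: T[], batch: number, batchSize: number): T[] {
  return subcases.slice(batch * batchSize, (batch + 1) * batchSize);
}
```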
* Add float32-blendable feature validation tests
* apply suggestion
* roll types

Co-authored-by: Kai Ninomiya <[email protected]>
I forgot that we could adjust maxSubcasesInFlight to avoid problems with there being too many subcases in a case. This reverts commit ec54937 as well as changes the default value for maxSubcasesInFlight.
Even in 2024 Google/Chromium's infra can't handle unicode 🤬
* Tests subgroupAnd, subgroupOr, and subgroupXor
* data types
* compute
* fragment
* Data types
* compute tests: uniform and split
* fragment tests: uniform
* Data types
* compute: uniform and non-uniform
* fragment: uniform
This solves 2 issues:

1. Show all the failures instead of only the first failure. This was an issue because I'd fix the one failure only to later find there were worse cases. Showing just the worst case is also not enough.
2. Show a representation of each test. The issue here is that whatever prints the `Error` message truncates the list, so when I printed them all and there were more than about 20 (because of so many permutations) it would cut off the list, defeating the point of (1) above. So I opted to show one query per test of each length.

Example:

Error: Generated test variant would produce too-long -actual.txt filename. Possible solutions:
  - Reduce the length of the parts of the test query
  - Reduce the parameterization of the test
  - Make the test function faster and regenerate the listing_meta entry
  - Reduce the specificity of test expectations (if you're using them)
webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-8x8-unorm-srgb";dim="cube";filt="linear";modeU="m";modeV="m";modeW="m";*

Error: Generated test variant would produce too-long -actual.txt filename. Possible solutions:
  - Reduce the length of the parts of the test query
  - Reduce the parameterization of the test
  - Make the test function faster and regenerate the listing_meta entry
  - Reduce the specificity of test expectations (if you're using them)
|<------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------>|
webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-8x8-unorm-srgb";dim="cube";filt="linear";modeU="m";modeV="m";modeW="m";*
webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-12x12-unorm-srgb";dim="cube";filt="linear";modeU="m";modeV="m";modeW="m";*
webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-10x8-unorm-srgb";dim="cube";filt="linear";modeU="m";modeV="m";modeW="m";*
webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="fragment";format="astc-12x12-unorm-srgb";dim="cube";filt="nearest";modeU="m";modeV="m";modeW="m";*
webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-12x12-unorm-srgb";dim="cube";filt="nearest";modeU="m";modeV="m";modeW="m";*
webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_array_2d_coords:stage="compute";format="astc-12x12-unorm-srgb";filt="linear";modeU="m";modeV="m";offset=false;*
webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_array_2d_coords:stage="fragment";format="astc-12x12-unorm-srgb";filt="nearest";modeU="m";modeV="m";offset=false;*
webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_array_2d_coords:stage="compute";format="astc-12x12-unorm-srgb";filt="nearest";modeU="m";modeV="m";offset=false;*
webgpu:shader,execution,expression,call,builtin,textureSampleLevel:sampled_array_2d_coords:stage="fragment";format="astc-12x12-unorm-srgb";filt="nearest";modeU="m";modeV="m";offset=false
webgpu:shader,execution,expression,call,builtin,textureSampleLevel:sampled_array_2d_coords:stage="compute";format="astc-12x12-unorm-srgb";filt="nearest";modeU="m";modeV="m";offset=false
addressMode is broken on many GPUs. Moving this from a subcase to a case lets us separate out those failures. I opted to put the parameters in minFilter, addressMode, offset order, which means for some builtins I moved minFilter from subcase to case. This turned into a massive change because the query strings got too long, so I had to rename:

* addressModeX -> modeX
* filter -> filt
* clamp-to-edge -> c
* repeat -> r
* mirror-repeat -> m
* viewDimension -> dim
* fragment -> f
* compute -> c
* vertex -> v

Another option would be to move all the tests from `webgpu:shader,execution,expression,call,builtin` to `webgpu:texture`, though I'm not sure that would be enough on its own. Note: this takes `webgpu:shader,execution,expression,call,builtin,*` from 21738 cases to 113546 cases
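The renames listed above, collected into a lookup table for reference (a sketch; the actual renames are spread across the test parameterization, and the addressModeX entries are shown expanded to U/V/W):

```ts
const queryStringAbbreviations: Record<string, string> = {
  // parameter names
  addressModeU: 'modeU',
  addressModeV: 'modeV',
  addressModeW: 'modeW',
  filter: 'filt',
  viewDimension: 'dim',
  // parameter values
  'clamp-to-edge': 'c',
  repeat: 'r',
  'mirror-repeat': 'm',
  fragment: 'f',
  compute: 'c',
  vertex: 'v',
};
```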
* Test indexing of a matrix using non-const index
* fixup

Signed-off-by: sagudev <[email protected]>
Co-authored-by: Corentin Wallez <[email protected]>
See Commits and Changes for more details.
Created by pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )