[pull] main from gpuweb:main #15

Merged
merged 99 commits into from
Oct 25, 2024
@pull pull bot commented Aug 14, 2024

See Commits and Changes for more details.


Created by pull[bot]


dneto0 and others added 30 commits August 13, 2024 15:05
The clamp validation test creates a 'foo' function that contains
code to be checked.  The entry point must call 'foo' in order
for those override expressions to be checked.
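A minimal sketch of the shader shape this describes, assuming a hypothetical helper `makeShader` (this is not the CTS's actual code): the override-dependent `clamp` lives in `foo`, and the entry point either calls `foo` or leaves it unreferenced.

```typescript
// Hypothetical helper for illustration only; not the CTS implementation.
function makeShader(callFoo: boolean): string {
  return `
override low: f32 = 0.0;
override high: f32 = 1.0;

fn foo() -> f32 {
  // The override expression under test.
  return clamp(0.5, low, high);
}

@compute @workgroup_size(1)
fn main() {
  ${callFoo ? '_ = foo();' : ''}
}`;
}

// With the call, 'foo' is reachable from the entry point.
const checkedShader = makeShader(true);
// Without the call, 'foo' is not statically accessed by 'main'.
const uncheckedShader = makeShader(false);
```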

Bug: crbug.com/351378281
'values' test:
* Infer result type instead of explicitly specifying it.   If specified,
  it's used as the explicit type of the result variable.  Infer it
  instead.  This fixes cases where the type is abstract.
* low == high is an error
  gpuweb/gpuweb#4616

'partial_eval_errs' test:
* Test function foo() must be called from the entry point so
  that overrides are validated.

'early_eval_errs' test:
* Fix argument order: low and high args are first and second.

Bug: crbug.com/351378281
The tests query the GPU's mip-to-mip interpolation.
The check that the query succeeded was too strict,
so this PR relaxes it.
For const cases, low < high is required
See gpuweb/gpuweb#4616

Bug: crbug.com/351378281
Add variants where the function containing the function call being
tested is both in and not-in the tested shader, i.e. statically
accessed or not.

This matters for override validation.
…er (#3906)

This matters when exercising the 'override' cases.
#3905)

This matters when exercising the 'override' cases.
The issue is that on Chrome and Firefox on Intel Mac,
when sampling between 2 mip levels using `textureSampleLevel`
in a compute shader, the weights used for mixing are
very unexpected. The same issue doesn't happen in Safari TP,
so there is probably a fix.

For now though, the same issue doesn't happen when using
a fragment shader. So, switched to using a fragment
shader to look up these weights. This is more appropriate
for the current tests because the tests are running in
fragment shaders.

Will add an issue to test all stages.
I noticed when I click stop, sometimes it seems ignored. I think
this is why.
)

This affects 'override' cases in the 'partial_values' subtest.
This change reduces the number of shader modules created by
2 orders of magnitude. The issue is offsets must be constants
so randomly generating them makes a new shader. This makes
them less random so they are the same per test.

So for example: webgpu:shader,execution,expression,call,builtin,textureSampleLevel:sampled_2d_coords:*
goes from 1197 shader modules to 4
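The effect can be sketched as follows. The names `offsetForCall` and `shaderSourceFor` are hypothetical and the generated WGSL line is only illustrative; the point is that when the offset depends only on the call index, every subcase produces identical source text and can share one shader module.

```typescript
// Hypothetical names; a sketch, not the CTS implementation.
type Offset = [number, number];

// Offsets depend only on the call index, so every subcase sees the same values.
function offsetForCall(callIndex: number): Offset {
  // WGSL texture offsets must lie in [-8, 7].
  return [(callIndex % 16) - 8, ((callIndex * 3) % 16) - 8];
}

// The offset is baked into the WGSL source as a constant expression.
function shaderSourceFor(numCalls: number): string {
  const lines: string[] = [];
  for (let i = 0; i < numCalls; i++) {
    const [x, y] = offsetForCall(i);
    lines.push(
      `  results[${i}] = textureSampleLevel(T, S, coords[${i}], levels[${i}], vec2i(${x}, ${y}));`
    );
  }
  return lines.join('\n');
}

// Simulate many subcases: identical source text means one module can be reused.
const uniqueSources = new Set<string>();
for (let subcase = 0; subcase < 100; subcase++) {
  uniqueSources.add(shaderSourceFor(50));
}
```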
This is useful for CQs where bots have different hardware
than devs have locally.
The old code took the lcm of width and height
for cubemaps because cubemaps must be square and
texture dimensions must be a multiple of the block size.

With a format with a blockSize of 5x8 and a minSize
of 32, that would end up taking the lcm of 35 and 32, which is 1120.
Then for a cube array it would end up allocating
1120x1120x24, and if the format is rgba32float that's about 481 MB.

The new code just takes the lcm of blockWidth and blockHeight
and then aligns to that, which is much smaller.
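The difference between the two approaches can be sketched like this. The function names are hypothetical; only the 5x8 block size and the minimum size of 32 come from the description above.

```typescript
function gcd(a: number, b: number): number {
  while (b !== 0) {
    [a, b] = [b, a % b];
  }
  return a;
}

function lcm(a: number, b: number): number {
  return (a / gcd(a, b)) * b;
}

// Round value up to the next multiple of `multiple`.
function alignUp(value: number, multiple: number): number {
  return Math.ceil(value / multiple) * multiple;
}

// Old approach: align width and height separately, then take their lcm.
function oldCubeSize(blockWidth: number, blockHeight: number, minSize: number): number {
  return lcm(alignUp(minSize, blockWidth), alignUp(minSize, blockHeight));
}

// New approach: take the lcm of the block dimensions, then align to it.
function newCubeSize(blockWidth: number, blockHeight: number, minSize: number): number {
  return alignUp(minSize, lcm(blockWidth, blockHeight));
}

const oldSize = oldCubeSize(5, 8, 32); // lcm(35, 32)
const newSize = newCubeSize(5, 8, 32); // 32 aligned up to lcm(5, 8)
```

Both results are square and a multiple of both block dimensions, which is all the cubemap case requires.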
Co-authored-by: Peter McNeeley <[email protected]>
* Tests subgroup_size and subgroup_invocation_id in fragment shaders
Remove the OT tokens for Chrome's initial WebGPU feature launch.

crbug.com/358117283
textureGather needs to not choose centers of texels because
depending on the backend it could choose the texels to the
left or right (up or down) from the texture coordinate.

There was code to do this but it needed to know it was being
used for textureGather.
Each texture test makes 50 calls to `textureXXX`. It then
checks that they all match expectations. For each one that does NOT match,
a slow binary search runs to find the samples that
influenced the results.

This change makes it so only the first failure does this slow
check. Ideally no tests would fail, but when they do, checking all
50 sample points can cause test bots to time out.
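A minimal sketch of that control flow, with hypothetical names: `slowDiagnose` stands in for the expensive binary search, and only the first mismatch invokes it.

```typescript
// Hypothetical sketch; not the CTS implementation.
function checkCalls(
  actual: number[],
  expected: number[],
  slowDiagnose: (callIndex: number) => string
): string[] {
  const errors: string[] = [];
  let diagnosed = false;
  actual.forEach((value, i) => {
    if (value === expected[i]) return;
    let message = `call ${i}: got ${value}, expected ${expected[i]}`;
    if (!diagnosed) {
      // Only the first mismatch pays for the slow sample-point search.
      message += `\n${slowDiagnose(i)}`;
      diagnosed = true;
    }
    errors.push(message);
  });
  return errors;
}

let slowCalls = 0;
const reported = checkCalls([1, 2, 3, 4], [1, 0, 3, 0], i => {
  slowCalls++;
  return `sample points for call ${i}`;
});
```

Every mismatch is still reported; only the detailed diagnosis is limited to the first one.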
* Remaining subgroup validation tests

* Adds must_use tests to ballot
* Adds const id requirement to broadcast
* Adds tests for other builtins:
  * broadcast first,
  * elect,
  * min, max,
  * and, or, xor,
  * shuffle, shuffle xor, shuffle up, shuffle down
  * quad broadcast
  * quad swap x, quad swap y, quad swap diagonal
* Add enable validation tests to broadcast and ballot
The code wasn't correctly discarding edge cases for negative texture
coordinates.  For example -2.5 sits in the center of a texel and
for textureGather could go either left or right. We need to avoid that
case.
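One way to make such a check independent of sign is to take the fractional position with `Math.floor` rather than `%`, since the `%` operator keeps the sign of its left operand in JavaScript and TypeScript. This is a sketch with hypothetical names and an assumed tolerance; the message does not say what the original code did wrong.

```typescript
// Fractional position within a texel, in [0, 1), for positive and negative coordinates.
function fractInTexel(coordInTexels: number): number {
  return coordInTexels - Math.floor(coordInTexels);
}

// True when the coordinate is close enough to a texel center that
// textureGather could legitimately pick texels on either side.
// The default epsilon is an assumed tolerance for illustration.
function isNearTexelCenter(coordInTexels: number, epsilon = 1 / 32): boolean {
  return Math.abs(fractInTexel(coordInTexels) - 0.5) < epsilon;
}

// Note: -2.5 % 1 evaluates to -0.5, so a check written as
// `x % 1 === 0.5` would miss negative texel centers.
const negativeCenter = isNearTexelCenter(-2.5);
const positiveCenter = isNearTexelCenter(2.5);
const offCenter = isNearTexelCenter(-2.25);
```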

Also, textureGather always behaves as if mipmapFilter were `nearest`.
This was fine as it was, since it would always pass 0 for mipLevel,
which meant the 2nd mip level always provided zero contribution,
but arguably it should take the `nearest` path.
See comments in texture_utils.ts
* Compute tests with all active invocation and partially active
  invocations
* Removed unimplemented dynamically uniform subgroupBroadcast test (due
  to const requirement)
Previously Dawn didn't correctly validate that render pipelines with no
attachments are errors. After fixing Dawn some CTS tests hit the new
validation. Fix them up to follow the WebGPU spec.

See #3754
greggman and others added 29 commits October 7, 2024 18:45
I missed a case. This should fix it for now.
Adds tests that the padding between vec3h elements in an array is not
disturbed.
The tests fill depth textures with random values,
except that if the comparison mode is 'equal' or 'not-equal'
we only put 0.0, 0.6, and 1.0 in the texture. depthRef
for all tests is 0.0, 0.5, or 1.0, so the comparisons
should produce valid tests.
Test texture builtins on all stages. Previously
only the fragment stage was tested.

Note: Some of these are expected to fail on Intel Mac
because in compute shaders, Intel Mac doesn't do
bilinear interpolation between mip levels, at least
when not using argument buffers.
The derivative portion of mip level selection
was being incorrectly quantized. That quantization
is left over from a previous implementation; now
quantization happens when the mip level is
chosen.
The default gradient mult used [1,1,1] and then set one
of them to the value that would generate the required mipLevel.
But that ignored the fact that if the required mipLevel is
less than 0, the 1s left over would generate a mipLevel of 0.

For now, make the default [0,0,0]. Another alternative
would be to set them all to the desired value that generates
the desired mipLevel or, we could set them to something less
than the value that generates the desired mip level.
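The problem can be illustrated with a deliberately simplified model. It assumes the selected mip level is log2 of the largest multiplier, which is not necessarily how the CTS or a GPU computes it, and all names are hypothetical.

```typescript
// Simplified model, an assumption for illustration: the selected mip level is
// log2 of the largest gradient multiplier.
function mipLevelFromMults(mults: number[]): number {
  return Math.log2(Math.max(...mults));
}

// Start from a default multiplier and set one component to 2^mipLevel.
function makeMults(defaultMult: number, mipLevel: number): number[] {
  const mults = [defaultMult, defaultMult, defaultMult];
  mults[0] = Math.pow(2, mipLevel);
  return mults;
}

// Requesting mip level -2 with a default of 1: the leftover 1s dominate.
const withOnes = mipLevelFromMults(makeMults(1, -2));
// Requesting mip level -2 with a default of 0: only the chosen value counts.
const withZeros = mipLevelFromMults(makeMults(0, -2));
```

Under this model the [1,1,1] default cannot produce a level below 0, while the [0,0,0] default yields the requested negative level.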
It used to be that mix weights for all stages were queried
at initialization time. The problem is, on Intel Mac, the
compute stage always fails. That means the majority of texture
tests just would not run on Intel Mac, at least on Chrome and
Firefox at the moment.

Changing the code to load them per stage on demand means
the texture builtin tests can run for compute and fragment
stages on Intel Mac.

You could argue this doesn't need to be fixed since Chrome and
Firefox should fix their texture code on Intel Mac but I think
at the moment there are a bunch more high priority things
being worked on so it's best not to lose coverage for compute
and fragment shaders.
Compat is allowed to have no storage buffers,
but the texture-builtin tests were relying on storage
buffers both for inputs and outputs.

For inputs, switching to a uniform buffer is fine.
There are 50 calls with at most 5 parameters each
aligned to 16 bytes so that's 4000 bytes which fits
within the minimum uniform block size.
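The arithmetic, written out as a sketch. The constant names are hypothetical, and the 16384-byte limit is an assumption about the smallest uniform binding size a compatibility-mode device guarantees; it is not stated in the message.

```typescript
const kNumCalls = 50;
const kMaxParamsPerCall = 5;
// Each parameter is padded to a 16-byte slot.
const kBytesPerParam = 16;

// Total bytes needed for all call inputs.
const inputBytes = kNumCalls * kMaxParamsPerCall * kBytesPerParam;

// Assumed minimum uniform binding size (an assumption, see the text above).
const kAssumedMinUniformBindingSize = 16384;
const fitsInUniformBuffer = inputBytes <= kAssumedMinUniformBindingSize;
```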

For outputs, switching to writing output to a texture
via the fragment shader works, but in order not to have
to change how derivatives work, we instead render
1 instance at a time and use setViewport to choose
which texel to write to. We were using @builtin(position)
in the fragment shader and expecting it to be 0.5, 0.5,
but since we're writing to different fragments now we
have to subtract the instance index (v.ndx) from position.x
to get it back to 0.5, 0.5.
Several GPUs have bugs with offset. Making it a case
instead of a subcase means it's easier to suppress these
failures without losing other coverage.
Depth textures were excluded originally because there
was no way to fill one via copyBufferToTexture. We added
a way to fill them later which was used for the textureLoad
tests. Now going back and enabling them for other tests.

Note that it might be better to move minFilter to a case instead
of a subcase, but that would break all expectations in CQs,
so it seemed best to filter these in the test itself.
identifySamplePoints works by doing a binary search
filling a texture with black (0,0,0,0) and white (1,1,1,1)
texels and then sampling it. Any non-zero results means
those white pixels were sampled.
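The search can be sketched as below. This is not the actual identifySamplePoints code: `findSampledTexels` and the `sample` callback are hypothetical, with the callback standing in for "fill these texels white, the rest black, run the builtin, and return the result".

```typescript
// Returns the sampled value when only the given texels are white.
type SampleFn = (whiteTexels: ReadonlySet<number>) => number;

function findSampledTexels(texelCount: number, sample: SampleFn): number[] {
  const found: number[] = [];
  const search = (candidates: number[]): void => {
    if (candidates.length === 0) return;
    // A zero result means none of these texels contributed.
    if (sample(new Set(candidates)) === 0) return;
    if (candidates.length === 1) {
      found.push(candidates[0]);
      return;
    }
    // Otherwise split the candidates in half and test each half.
    const mid = candidates.length >> 1;
    search(candidates.slice(0, mid));
    search(candidates.slice(mid));
  };
  search(Array.from({ length: texelCount }, (_, i) => i));
  return found;
}

// Fake sampler for illustration: blends texel 5 (weight 0.25) and texel 6 (weight 0.75).
const fakeSample: SampleFn = white =>
  (white.has(5) ? 0.25 : 0) + (white.has(6) ? 0.75 : 0);

const sampledTexels = findSampledTexels(16, fakeSample);
```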

This doesn't work for comparisons like textureSampleCompare,
textureSampleCompareLevel, and textureGatherCompare because
the result of those is 0 or 1, so for example, if the comparison
is 'always' then all texels will show up as sampled.

So, instead, if the builtin being tested is a comparison
we convert the call to the corresponding non-comparison builtin.

* textureSampleCompare -> textureSample
* textureSampleCompareLevel -> textureSampleLevel
* textureGatherCompare -> textureGather

This lets us find the sample points as best we can (it assumes
those functions sample the same texels).
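The substitution listed above amounts to a small lookup, sketched here with hypothetical names; builtins that are not comparisons pass through unchanged.

```typescript
// Comparison builtin -> non-comparison builtin used only for locating sample points.
const kNonComparisonBuiltin = new Map<string, string>([
  ['textureSampleCompare', 'textureSample'],
  ['textureSampleCompareLevel', 'textureSampleLevel'],
  ['textureGatherCompare', 'textureGather'],
]);

function builtinForSamplePointSearch(builtin: string): string {
  return kNonComparisonBuiltin.get(builtin) ?? builtin;
}
```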

Once we have the sample points we then want to look up the actual
values of the texels and print them out. This requires
reading the texture back from the GPU. We made the texture ourselves
so we could theoretically pass the data we sent to the GPU
down to identifySamplePoints, but it seems good to get the values
from the GPU itself so at least they made a round trip through the
GPU.

Then, if it's a comparison, we print out the result of each
comparison with that texel. Hopefully this will help us identify
why these tests don't pass on some devices.
No idea why the pre-submit didn't catch this 🤷‍♂️
…4001)

Many of these tests have too many subcases, which run in parallel and
can run out of resources. We can use batching to break tests into more
cases (without having to break them up arbitrarily by the numeric values
passed to builtins).

Not all of the :values: tests actually have more than 125 subcases; if
they don't, then `batch` has no effect. I added `batch` to all of them
anyway so that when tests are added in the future, they'll serve as an
example and get copied over.
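The effect of batching can be sketched with a generic chunking helper. `splitIntoBatches` is a hypothetical name, not the CTS API, and the batch size of 125 is assumed from the threshold mentioned above.

```typescript
// Split a list of subcases into consecutive batches of at most batchSize items.
function splitIntoBatches<T>(subcases: readonly T[], batchSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < subcases.length; i += batchSize) {
    batches.push(subcases.slice(i, i + batchSize));
  }
  return batches;
}

// 300 subcases become three cases of 125, 125, and 50 subcases.
const largeTest = splitIntoBatches(Array.from({ length: 300 }, (_, i) => i), 125);
// 100 subcases stay in a single case, so batching changes nothing.
const smallTest = splitIntoBatches(Array.from({ length: 100 }, (_, i) => i), 125);
```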

Hopefully helps with https://crbug.com/373478528
and https://crbug.com/373485785
* Add float32-blendable feature validation tests

* apply suggestion

* roll types

---------

Co-authored-by: Kai Ninomiya <[email protected]>
I forgot that we could adjust maxSubcasesInFlight to avoid problems with
there being too many subcases in a case.

This reverts commit ec54937 as well as
changes the default value for maxSubcasesInFlight.
Even in 2024 Google/Chromium's infra can't handle unicode 🤬
* Tests subgroupAnd, subgroupOr, and subgroupXor
  * data types
  * compute
  * fragment
* Data types
* compute tests: uniform and split
* fragment tests: uniform
* Data types
* compute: uniform and non-uniform
* fragment: uniform
2 issues this solves

1. Show all the failures instead of only the first failure

   This was an issue because I'd fix the one failure only to later
   find there were worse cases. Just the worst case is also not
   enough.

2. Show a representation of each test

   The issue here is that whatever prints the `Error` message
   truncates the list, so when I printed them all and there were
   more than about 20 (because of so many permutations) it would
   cut off the list, defeating the point of (1) above.

   So, opted to show one query per test of each length.

Example:

            Error: Generated test variant would produce too-long -actual.txt filename. Possible solutions:
              - Reduce the length of the parts of the test query
              - Reduce the parameterization of the test
              - Make the test function faster and regenerate the listing_meta entry
              - Reduce the specificity of test expectations (if you're using them)
            webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-8x8-unorm-srgb";dim="cube";filt="linear";modeU="m";modeV="m";modeW="m";*

            Error: Generated test variant would produce too-long -actual.txt filename. Possible solutions:
              - Reduce the length of the parts of the test query
              - Reduce the parameterization of the test
              - Make the test function faster and regenerate the listing_meta entry
              - Reduce the specificity of test expectations (if you're using them)
            |<------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------>|
            webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-8x8-unorm-srgb";dim="cube";filt="linear";modeU="m";modeV="m";modeW="m";*
            webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-12x12-unorm-srgb";dim="cube";filt="linear";modeU="m";modeV="m";modeW="m";*
            webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-10x8-unorm-srgb";dim="cube";filt="linear";modeU="m";modeV="m";modeW="m";*
            webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="fragment";format="astc-12x12-unorm-srgb";dim="cube";filt="nearest";modeU="m";modeV="m";modeW="m";*
            webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_3d_coords:stage="compute";format="astc-12x12-unorm-srgb";dim="cube";filt="nearest";modeU="m";modeV="m";modeW="m";*
            webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_array_2d_coords:stage="compute";format="astc-12x12-unorm-srgb";filt="linear";modeU="m";modeV="m";offset=false;*
            webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_array_2d_coords:stage="fragment";format="astc-12x12-unorm-srgb";filt="nearest";modeU="m";modeV="m";offset=false;*
            webgpu:shader,execution,expression,call,builtin,textureSampleGrad:sampled_array_2d_coords:stage="compute";format="astc-12x12-unorm-srgb";filt="nearest";modeU="m";modeV="m";offset=false;*
            webgpu:shader,execution,expression,call,builtin,textureSampleLevel:sampled_array_2d_coords:stage="fragment";format="astc-12x12-unorm-srgb";filt="nearest";modeU="m";modeV="m";offset=false
            webgpu:shader,execution,expression,call,builtin,textureSampleLevel:sampled_array_2d_coords:stage="compute";format="astc-12x12-unorm-srgb";filt="nearest";modeU="m";modeV="m";offset=false
addressMode is broken on many GPUs. Moving this from a subcase
to a case lets us separate out those failures.

I opted to put the parameters in minFilter, addressMode, offset
order, which means for some builtins I moved minFilter from subcase
to case.

This turned into a massive change because the queryStrings got too long,
so I had to change:

* addressModeX -> modeX
* filter -> filt
* clamp-to-edge -> c
* repeat -> r
* mirror-repeat -> m
* viewDimension -> dim
* fragment -> f
* compute -> c
* vertex -> v

Another option would be to move all the tests from
`webgpu:shader,execution,expression,call,builtin` to
`webgpu:texture`, though I'm not sure that would be enough on its own.

Note: this takes `webgpu:shader,execution,expression,call,builtin,*`
from 21738 cases to 113546 cases
* Test indexing of a matrix using non-const index

Signed-off-by: sagudev <[email protected]>

* fixup

Signed-off-by: sagudev <[email protected]>

---------

Signed-off-by: sagudev <[email protected]>
Co-authored-by: Corentin Wallez <[email protected]>
@teoxoy teoxoy merged commit 158caad into mozilla:main Oct 25, 2024
1 check passed