Postprocessing: Add SSILVB GI and AO #29668
Hey, thanks for looking at my implementation of the SSILVB paper. I'm fairly confident the code works as a reference implementation, but there are a lot of details which aren't in the blog post (some of which I'm still figuring out). How to denoise it and blend it properly with the rest of the lighting is, I think, still an open question. Your demo looks pretty close and definitely a step up from GTAO. I probably won't update the blog until I release something, but you can try different sampling strategies, and also using an accumulation buffer over a couple of frames. Here are some shots I took just now (since I've been tweaking things since the blog went up). Granted, the effect is a bit boosted (i.e. not exactly accurate), but I wanted it to be obvious since it's taking a chunk of the frametime. You'll probably have to open the images in a new tab to A/B test them, as the thumbnails don't really show it. I may make another post once it's 100% done, but I'd be interested to see what you discover, as the demo you posted actually looks pretty promising and better in some ways than what I have now. Thanks.
amazing
As far as how the blending works... I'm just piggybacking off of the GTAO that @Rabbid76 wrote; it should be around here:
Actually, scratch what I said earlier; it wasn't world vs. screen coordinates or the... They were being packed 0-1 via... In addition to creating that leftward shadow, it also happened to make the scene have much softer lighting... I like the softer lighting, so I'll leave it in as a checkbox until I can figure out how to do it properly... Now to figure out why it's casting rays left/right much farther than up/down...
Oh yeah. I think I might have posted a bug in the code. This line is actually supposed to be this:
Thanks! I think that doubles the sample efficiency 😄 I think that's all for obvious bugs... which is interesting, because I think I liked the aesthetic more when some of the bugs were left in 🤔; the results are much more similar to GTAO now (which, I suppose, makes sense, given that it's meant to be a refinement of the concept). I'm setting up the SSRT3 Reference Unity Repo now to see how it compares...
Here are some recordings. Afterward, I discovered this "Mip Optimization" flag was responsible for most of the flickering in the GI:

MipOptimizationAliasing.mp4

The ambient occlusion-only mode looks phenomenal; very few of the artifacts from the GI mode:

SSRTAOTest.mp4

The denoiser they're using isn't great; the one in three.js is much better, I believe 😄 With the Mip Optimization "bug" fixed, it feels very nice; similar to how CryEngine used to feel:

HighestQuality.mp4

With some temporal reprojection and the three.js denoiser, I bet it would feel pretty solid. Then I tested the LittlestTokyo scene: https://youtu.be/BGg_Z5icnl4 WOW! The effect that this shader has on this scene is truly unreal. The Unity version is definitely doing something my three.js version is not 😅
@zalo Do you mind implementing this feature with TSL? TBH, I don't think we want to add new post processing passes to the old effect composer, given the new post processing system in https://threejs.org/examples/?q=webgpu%20postprocessing E.g. the new motion blur and TRAA implementations are TSL-based.
The TSL based GTAO implementation is here: https://github.com/mrdoob/three.js/blob/dev/examples/jsm/tsl/display/GTAONode.js |
I think I might have made a minor breakthrough... I decided to port the reference HLSL implementation from the SSRT3 repo and found an interesting bug... when it's accumulating the two halves of the horizon, only the second half of the accumulation comes out properly!

One slice in the good hemisphere:

One slice in the bad hemisphere:

It seems like the problem is somewhere inside of the... I don't see this two-part hemisphere accumulation in cybereality's code, so it's possible he already noticed something funny about this 😅

Anyways, if I accumulate 8 slices of only the working hemisphere, then the AO starts to look preeeetty good:

Unlike the picture at the beginning of the thread, this one is also mostly correct (no weird shadows going off in one direction)... though, the thickness is turned up pretty high... The code is pretty messy right now... a battlefield of default "uniforms" defined in-line and failed debugging codepaths... but I'll see if I can get it pushed soon for folks to look at.

EDIT: Just pushed it here: https://raw.githack.com/zalo/three.js/feat-ssilvb/examples/webgl_postprocessing_ssilvb.html

If I resolve the hemisphere thing to my satisfaction, then I'll look at reflected light (GI), and then maybe a TSL port...
Interesting. My original code was doing two horizons, but I found it wasn't needed (and it doubles the number of samples you need). How I think it should work: you have a vector (let's say the surface normal) and then you sample 180 degrees (Pi) centered around the normal. The horizon itself captures the hemisphere, so you don't need to explicitly check each side. But that was the most confusing part, and I did a lot of trial and error on those variables, so it's perhaps incorrect.
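The single- vs. double-horizon distinction above can be sketched in a few lines of JavaScript (the function name and the jitter placement are mine, not from either implementation): since a slice direction and its opposite define the same slice plane, rotating the slices across Pi already covers every plane that the 2*Pi variant does, at half the sample count.

```javascript
// Sketch of slice-direction generation for horizon-based AO.
// Single-horizon variant: slice rotation angles span [0, PI).
// Double-horizon variant (fullCircle = true): angles span [0, 2*PI),
// requiring twice as many slices for the same plane coverage.
function sliceAngles(numSlices, fullCircle = false) {
  const span = fullCircle ? 2 * Math.PI : Math.PI;
  const angles = [];
  for (let i = 0; i < numSlices; i++) {
    // The 0.5 offset centers each slice in its bin; per-pixel jitter
    // would replace it in a real implementation.
    angles.push((i + 0.5) * span / numSlices);
  }
  return angles;
}
```

Each slice's horizon scan then covers the full hemisphere around the normal, which is why no explicit left/right check is needed.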
After going back to the one based on your code, I think you're right, and I can see that they're pretty much doing the same thing (just needing twice as many slices for the same number of samples, and different default constants). The original code does guarantee that every left sample will be balanced by a right sample, which could have some subtle aesthetic quality though 🤔 (though yours should also have it if there is an even number of slices)... However, I think this correction vs. ... The latter does seem like it's doing two samples really close to each other, so perhaps that's what the perceived bug was... try it out in your engine to see if I made a mistake somewhere else 😅 I think the GI really shines when it's in a scene with both light and shadow near each other... neither of our test scenes seems very good for this... Also, fwiw, this is the blending code I found in SSRT3: https://github.com/cdrinmatane/SSRT3/blob/main/HDRP/Shaders/Resources/SSRT.shader#L242-L267
Actually, you're right. It was a bug in my code. I was originally doing 2*PI and changed it later (after I published the blog post). It seems that because I had a lot of jitter, it kind of compensated, but it was incorrect. Thanks a bunch for catching this.
Alrighty, as of now, I have three versions:

GLSL Port of SSRT3 (this is the best atm): https://raw.githack.com/zalo/three.js/feat-ssilvb/examples/webgl_postprocessing_ssilvb.html

I assume there are subtle bugs in the way I did the ports that account for the differences; I was only able to step-by-step debug the SSRT3 one against a running reference implementation 😅 The TSL version is a little shiny for some reason... that'll have to wait for tomorrow...
It also occurs to me that (when I figure out the GI half) this technique should probably also handle Screen Space Reflections (and maybe contact shadows), since it’s already tapping the textures in the right way… I guess that means it needs a roughness GBuffer 🧐 And, if we want to go crazy, we can probably solve the pop-in issues
by sampling from a stochastic depth buffer cubemap at the player's head. 🤯 If anyone remembers my old stochastic depth buffering demo… The cost might be worth it if we can also accumulate temporal reprojection samples to surfaces outside of the FoV and around corners… 😅 Perhaps just multiple layered cubemaps for an approximate depth peeling that builds up and reprojects over time… 🤯
I've thought about using this for other screen space techniques, but I'm not sure it's the best idea. Because for light you mostly want to sample in an even, uniform way, but for something like reflection, you are sampling in increasing increments. There are also ways to do SSR with Hi-Z that I don't think would be optimal for AO (but perhaps it could work, I didn't try it).
I think reflective surfaces don't accumulate AO or GI in the same way that diffuse surfaces do. A mirror doesn't need AO or GI... there's a continuum between diffuse reflection and mirror reflection, so perhaps this algorithm can just tighten the cone on smoother surfaces? Hi-Z looks like fun (if a bit time-consuming to implement); I think it would work for SSILVB too.
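To make the "tighten the cone" idea concrete, here is a hypothetical sketch: the mapping from roughness to cone half-angle below (squared roughness, in the spirit of common GGX parameterizations) is entirely my assumption, not from any of the implementations being discussed.

```javascript
// Hypothetical sketch: narrow the sampling cone on smooth surfaces so one
// pass could blend from diffuse AO/GI (wide cone) toward mirror-like SSR
// (tight cone). The squared-roughness mapping is an illustrative guess.
function coneHalfAngle(roughness) {
  const r = Math.min(Math.max(roughness, 0), 1); // clamp to [0, 1]
  return r * r * (Math.PI / 2); // 0 => perfect mirror, 1 => full hemisphere
}
```

A mirror (roughness 0) gets a zero-width cone, i.e. no AO/GI contribution at all, matching the intuition above.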
It seems like AO is in the air 😅 Time to find out if there are any tricks we're not using yet… perhaps something good in here:

EDIT: The author of the SSILVB/SAOVB shader I'm porting says it is indeed a straight-up improvement; thank goodness it is still licensed under MIT 😄
Had a go at porting the GT-VBAO shadertoy by @Mirko-Salm: https://raw.githack.com/zalo/three.js/feat-ssilvb-gtvbao/examples/webgl_postprocessing_ssilvb.html

Observations compared to the SSRT3 (original "VBAO") version...
I think I'd need to see it with a potentially better(?) noise function or temporal antialiasing to truly compare them... I'm still annoyed at the popping artifacts (off the side of the screen and around corners)... likewise, defining a fixed "thickness" is rough too... stochastic depth buffer cubemaps would solve all of these (at the cost of 9-25x increased texture samples 💀)... I just found a paper trying it, and it seems to do alright:
"The noise function is different and the Poisson Denoiser doesn't like it..." - The particular choice of noise function is not integral to GT-VBAO. You can use whatever noise function you like as long as it follows a uniform distribution. In my experience interleaved gradient noise works better than R1-Hilbert noise for denoising. I chose R1-Hilbert noise as the default for the Shadertoy demo simply because its screen space characteristics are more isotropic than those of IGN (as long as you don't denoise).
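For reference, interleaved gradient noise is tiny: the constants below are the published formula from Jimenez's 2014 "Next Generation Post Processing in Call of Duty: Advanced Warfare" talk; only the JavaScript wrapper and function names are mine.

```javascript
// GLSL-style fract(): fractional part of a number.
function fract(v) {
  return v - Math.floor(v);
}

// Interleaved gradient noise (Jimenez 2014): maps integer pixel
// coordinates to a uniformly distributed value in [0, 1) whose spatial
// structure plays nicely with spatial denoisers.
function interleavedGradientNoise(x, y) {
  return fract(52.9829189 * fract(0.06711056 * x + 0.00583715 * y));
}
```

In a shader this is typically evaluated per pixel on `gl_FragCoord.xy` and used to jitter the slice rotation angle.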
Thank you for the advice; I really appreciate the work that you've done on this! With interleaved gradient noise, the noise pattern now seems to match the other implementations... however, it still seems like the increased contrast from this (more correct, natural-looking AO) is still too much for the current atemporal denoiser to smooth out... 🫠 I did manage to tune the denoiser parameters a little bit to preserve edges better, but it's not quite enough without accumulation. The side-by-side is quite striking: For context again, that's:
Short (compressed) video futzing around with the two implementations:

Screen.Recording.2024-10-30.140235.mp4

three.js is getting a new temporal antialiasing technique that should be able to handle denoising over time, but I'm also considering the atemporal option (there are always folks who are up in arms about the temporal smearing artifacts). Considering the broader denoising landscape, there are only a few options:
And others that probably aren't viable:
I wonder how hard it would be to train a new neural denoiser on low sample-count AO, converged AO, normals, etc...
Scratch the other denoiser suggestions; the three.js denoiser looks insanely good in comparison to them 😅 Enabling the temporal jitter with the denoiser starts to look really good on my 240hz monitor (biological TRAA 😅), so I'll just rest my hopes there for now... Thanks again @Mirko-Salm for publishing your implementation!
I'm glad to see that people find it useful!
@Mirko-Salm I'm glad you mentioned that; I actually switched it to the normalized version while porting, since the first version didn't render correctly in three.js 🫠 I suspect the depth unit is different enough (non-linear? reversed-z?) that applying the thickness to it directly behaved funnily 😅 As an aside: do you feel like the modifications to the VBAO would preclude adding GI functionality back into the shader? I'll admit I can't fully extrapolate whether the additional cosine weighting is just as good for reflected ambient illumination as it is for ambient occlusion… Also, what do you think about doing contact shadows and maybe screen space reflections in the same pass? Also, according to a reply tweet, it seems like the GLSL/Githack version has a memory leak that could be artificially slowing it down… rather than troubleshoot it, I'll just hope it goes away with the port to the simpler TSL system or WebGPU. 😅
it's just linear z from 0 to inf.
GI should also be possible. I haven't looked into it yet since I'm still busy with the uni-directional variant of GT-VBAO (having to always ray march bi-directionally is a bit of a downer). I can't really say how big of an improvement the cosine weighting would be for GI, but it shouldn't be hard to set up a reference GI ray marcher to get an upper bound on the possible quality improvements.
Hard to say if doing it all in one pass is going to be beneficial. Seems like something you just have to try and profile.

RE performance optimizations: try disabling #define USE_HQ_ACOS; the approximation should be good enough while being quite a bit faster. I only had the high-quality variants on by default to show that the converged results exactly match those of the reference ray marcher. Disabling #define USE_HQ_APPROX_SLICE_IMPORTANCE_SAMPLING might not be worth it, though.

EDIT: I have just added an improved version of ACos_Approx() to the GT-VBAO shadertoy code. The error introduced by using the acos approximation is now so small that there is basically no reason not to use it (by disabling #define USE_HQ_ACOS).
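For anyone curious what such an approximation looks like, here is a classic polynomial acos (Abramowitz & Stegun, eq. 4.4.45); to be clear, this is not necessarily the polynomial used in the shadertoy's ACos_Approx(), just an illustration of the accuracy/cost trade: roughly 1e-4 absolute error for a few multiply-adds instead of a transcendental call.

```javascript
// Cheap acos approximation (Abramowitz & Stegun 4.4.45):
//   acos(x) ~= sqrt(1 - x) * (1.5707288 - 0.2121144*x
//                             + 0.0742610*x^2 - 0.0187293*x^3)  for x in [0, 1]
// extended to [-1, 0) via acos(x) = PI - acos(-x).
function acosApprox(x) {
  const ax = Math.abs(x);
  const p = Math.sqrt(1 - ax) *
    (1.5707288 + ax * (-0.2121144 + ax * (0.0742610 - 0.0187293 * ax)));
  return x >= 0 ? p : Math.PI - p;
}
```

In horizon-based AO the acos of a dot product is evaluated per depth sample, so this kind of substitution adds up quickly.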
It's amazing to see those implementations in three.js! The GT-VBAO method is a great improvement over VBAO and makes it more physically correct (i.e. cosine-weighted vs. uniform sampling). I think the visual difference should be more subtle, though (maybe there is a sampling problem in the three.js SSRT3 implementation?).

SSRT3 (three.js):

SSRT3 (Unity implementation, from the video above with the same parameters):

GT-VBAO:

It looks like a lot of details are missing in the three.js version. In any case, very nice work, thanks for sharing :)
Thanks for bringing that up; that's a great observation! I only did line-by-line, side-by-side comparisons of the two implementations right up until the first slice (without noise/jitter), just to see if the inputs were the same handedness, but the odds are very good I messed up somewhere else in the Unity HLSL -> three.js GLSL port... The GT-VBAO port looks much more similar to the SSRT3 Unity screenshot (though, iirc, the Unity screenshot also has some tonemapping/postprocessing I forgot to disable before recording, leading to the soft brown coloration 😅). It should also be noted that the three.js screenshots also have noise, cartoon outlines, and incorrect transparency... so I wonder if that accounts for the majority of the remaining visual difference between three-GTVBAO and unity-VBAO... I'll try pushing forward with the GT-VBAO implementation for now and hope that it closes the visual fidelity gap after accounting for these differences 🧐 I'll include hedging notes here suggesting my port is buggy (and I'm sorry if the three.js tweet inadvertently maligned SSRT3! It's an awesome package for Unity, worth double the price 😄 And the SSRT3 scene looks so nice in AO; I want to live there...)
@Mirko-Salm Should I switch to the new Unidirectional Variant? What are the benefits?
The benefit is that you cut the number of depth buffer samples in half without doing the same to the quality. Well, at least if your scene doesn't primarily consist of camera-facing surfaces; in that case you lose about as much quality as you gain performance. Whether you would want to switch probably depends on whether marching a single direction per pixel gives you sufficiently good results for the denoiser.
Current Demo
Description
Recent advances in screen-space global illumination have yielded exceptional improvements to the realism and quality of real-time scenes, even over GTAO. One particular advancement is SSILVB, a screen-space global illumination technique that eases the computational burden by keeping track of the occluded horizon via a bitmask (allowing elements in the scene to have finite thickness and for more samples to be collected).
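A minimal sketch of the bitmask bookkeeping described above (the function and constant names are mine, not from the paper; a real shader does this per slice with bit-twiddling rather than loops): each slice's hemisphere is quantized into 32 angular sectors stored in one integer, each depth sample occludes a *range* of sectors (which is what gives samples finite thickness), and the final occlusion is just a bit count.

```javascript
const SECTORS = 32; // angular sectors per slice, packed into one 32-bit mask

// Mark the sectors covered by [startAngle, endAngle] (radians in [0, PI])
// as occluded. A sample with finite thickness covers a whole angular range,
// not just a single horizon angle.
function occludeRange(mask, startAngle, endAngle) {
  const a = Math.max(0, Math.floor(startAngle / Math.PI * SECTORS));
  const b = Math.min(SECTORS, Math.ceil(endAngle / Math.PI * SECTORS));
  for (let i = a; i < b; i++) mask |= (1 << i);
  return mask >>> 0; // keep the mask unsigned
}

// Occlusion is simply the fraction of sectors that are blocked.
function occlusion(mask) {
  let bits = 0;
  for (let i = 0; i < SECTORS; i++) bits += (mask >>> i) & 1;
  return bits / SECTORS;
}
```

Because overlapping samples just OR into the same bits, many more samples can be accumulated without double-counting occlusion, which is the efficiency win the technique claims.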
Solution
There are three MIT-Licensed implementations:
I have a flawed attempt at porting @cybereality's implementation onto @Rabbid76's GTAO here (it's flawed because the samples seem unbalanced; AO-only, no illumination):
https://raw.githack.com/zalo/three.js/feat-ssilvb/examples/webgl_postprocessing_ssilvb.html
Alternatives
For reference, compare to the existing GTAO algorithm, which exhibits:
Additional context
These noisy GI techniques may benefit from screen-space temporal accumulation as well... But this is a request for another issue 😄