[Discussion] Trying to improve performance for `random_in_unit_sphere()` #765

define-private-public · 2020-10-13T02:40:57Z

define-private-public
Oct 13, 2020

I'm talking about this function:

raytracing.github.io/src/common/vec3.h

Line 137 in 1c9562f

inline vec3 random_in_unit_sphere() {

It can, in theory, run for a very long time. I'm trying to find out whether I can replicate its results while getting rid of the while loop. For reference, here is how the metal material looks with the current implementation:

At first, I tried being sneaky and used a magic number to ensure that the squared length was always 1 or less:

constexpr double magic_num = 0.57735;       // ~1/sqrt(3); keeps the squared length (dot product with itself) at or below 1
return vec3::random(-magic_num, magic_num);

While this technically returned a vector within the unit sphere, it didn't cover all the cases that the reference implementation could generate. It produced an image that looked like this:
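For context, 0.57735 is approximately 1/sqrt(3), so the magic-number box is the cube inscribed in the unit sphere. A rough sketch of how much of the ball that cube can reach (the function name is hypothetical, not from the book):

```cpp
#include <cassert>
#include <cmath>

// Fraction of the unit ball's volume covered by the axis-aligned cube
// [-h, h]^3. Only meaningful while the cube fits inside the ball,
// i.e. while h <= 1/sqrt(3).
inline double cube_fraction_of_unit_ball(double h) {
    const double pi = 3.14159265358979323846;
    assert(h > 0.0 && 3.0 * h * h <= 1.0 + 1e-12);
    const double cube_volume = 8.0 * h * h * h;   // (2h)^3
    const double ball_volume = 4.0 * pi / 3.0;    // unit radius
    return cube_volume / ball_volume;
}
```

With h = 1/sqrt(3) this comes out to roughly 0.37, so well over half of the ball can never be sampled, which is consistent with the render looking different.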

Next, I tried generating a vector within the unit sphere in spherical coordinates and then converting it to Cartesian. Here is the code:

const double r = get_random(0, 1);
const double theta = get_random(0, Pi);
const double phi = get_random(0, TwoPi);

// Convert to the cartesian space
return vec3(
  r * std::sin(theta) * std::cos(phi),
  r * std::sin(theta) * std::sin(phi),
  r * std::cos(theta)
);

This didn't work either; it produced this image, which looks more specular:

I even tried creating a pool of "in unit sphere" vectors (generated with the book's code). It would generate a larger pool when it got exhausted. Its perf was slightly slower.

Does anyone have an idea of what could be a more performant implementation of random_in_unit_sphere() that tries to avoid (theoretically) infinite looping and maybe branching? Is my spherical->Cartesian conversion incorrect?

define-private-public · 2020-10-27T03:57:38Z

define-private-public
Oct 27, 2020
Author

So I think I've figured out why this is happening. I plotted out some random points in the 2D case. First using the method in the book, and then the one using a polar method.

Randomly generated points are more clustered in the center in the "polar method", but more spread out at the edges. Looking at this, I think I may have devised a method to improve performance in the random_in_unit_sphere(). I'll share once I test it out. If you want the source to generate the above graphs, it's here:

https://gist.github.com/define-private-public/f51ddbfc5e5c91dacbc73fa87517af23
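
A minimal 2D sketch of the clustering described above, assuming std::mt19937 in place of the book's random helpers (all names here are hypothetical). Drawing the radius uniformly puts about half of the points inside radius 0.5, although that inner disk holds only a quarter of the area; taking the square root of the uniform value is the usual correction.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct point2 { double x, y; };

// Naive polar sampling: the radius is uniform in [0, 1), so points
// cluster toward the center.
inline point2 disk_polar_naive(std::mt19937& rng) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const double r = unit(rng);
    const double phi = 6.283185307179586 * unit(rng);  // 2*pi
    return { r * std::cos(phi), r * std::sin(phi) };
}

// Area-corrected polar sampling: r = sqrt(u) gives uniform density per area.
inline point2 disk_polar_uniform(std::mt19937& rng) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const double r = std::sqrt(unit(rng));
    const double phi = 6.283185307179586 * unit(rng);  // 2*pi
    return { r * std::cos(phi), r * std::sin(phi) };
}

// Fraction of n samples that fall within `radius` of the origin.
template <typename Sampler>
double fraction_within(Sampler sample, double radius, int n, std::mt19937& rng) {
    assert(n > 0);
    int inside = 0;
    for (int i = 0; i < n; ++i) {
        const point2 p = sample(rng);
        if (p.x * p.x + p.y * p.y < radius * radius) ++inside;
    }
    return static_cast<double>(inside) / n;
}
```

For a radius of 0.5, the naive sampler should report a fraction near 0.5 and the corrected one a fraction near 0.25.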


genmeblog · 2020-11-06T13:46:57Z

genmeblog
Nov 6, 2020

Try this: http://extremelearning.com.au/how-to-generate-uniformly-random-points-on-n-spheres-and-n-balls/


define-private-public · 2020-11-17T03:45:16Z

define-private-public
Nov 17, 2020
Author

@genmeblog

Thanks for the link, it was a good read. The only method that seemed to be more performant for me was Method 16 (the polar method for 3D space). But it wasn't producing the same (or very similar) results visually; it was a bit off. To some it may not be noticeable at first glance.

If you render book 2's final scene and look at the fuzzy metal sphere off to the right, once with the random_in_unit_sphere() from the book and once with Method 16 (from your link), then toggle between the two fast enough, you'll see the difference. It's different enough to make me think the rays are bouncing in a different distribution.


genmeblog · 2020-11-17T09:09:09Z

genmeblog
Nov 17, 2020

@define-private-public

Can't argue here since I can't prove the formula for 3D space. I don't know why u is uniformly distributed, imho it should be cos. So, maybe when you replace sqrt(1-u^2) with sin(theta) and u with cos(theta) where theta is random(PI) things will look better? (r is cbrt(random()) - so this part is correct).


define-private-public · 2020-11-19T05:40:21Z

define-private-public
Nov 19, 2020
Author

Hi, do you mean this:

r = cbrt(random())
phi = random(2Pi)
theta = random(pi)

x = r * cos(phi) * sin(theta)
y = r * sin(phi) * sin(theta)
z = r * cos(theta)

If so, this is what I got, which is not correct:


genmeblog · 2020-11-19T16:37:05Z

genmeblog
Nov 19, 2020

Right... Ok, so I can't help here. My blind shot was missed...


trevordblack · 2020-11-19T17:39:12Z

trevordblack
Nov 19, 2020
Maintainer

https://math.stackexchange.com/questions/87230/picking-random-points-in-the-volume-of-sphere-with-uniform-probability

The cube root works for the radius when the direction is drawn from a normal distribution.
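
A minimal sketch of that idea, assuming the standard library's std::normal_distribution and std::mt19937 rather than the book's helpers (the struct and function names are hypothetical): three independent Gaussian values give a direction with no preferred axis, and the cube root of a uniform value gives the radius.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct point3 { double x, y, z; };

// Uniform point in the unit ball: Gaussian direction, cube-root radius.
// The distributions are rebuilt on every call for brevity, not for speed.
inline point3 random_in_unit_ball_gaussian(std::mt19937& rng) {
    std::normal_distribution<double> normal(0.0, 1.0);
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    // Three independent Gaussians: the normalized vector is uniform on the sphere.
    const double x = normal(rng);
    const double y = normal(rng);
    const double z = normal(rng);
    const double len = std::sqrt(x * x + y * y + z * z);
    if (len == 0.0) return { 0.0, 0.0, 0.0 };  // practically unreachable guard

    // cbrt(u) yields a radius whose density is proportional to r^2.
    const double scale = std::cbrt(unit(rng)) / len;
    const point3 p{ x * scale, y * scale, z * scale };
    assert(p.x * p.x + p.y * p.y + p.z * p.z <= 1.0 + 1e-9);  // sanity check
    return p;
}
```

This trades the rejection loop for three Gaussian draws, a sqrt, and a cbrt, so it is not obviously faster; it would need measuring.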


trevordblack · 2020-11-19T18:03:44Z

trevordblack
Nov 19, 2020
Maintainer

double theta = random(0, 2*PI);
// Solves for clumping along the vertical axis
double v = random(0, 1);
double phi = acos((2*v) - 1);
// Solves for clumping toward the center
double r = cbrt(random(0, 1));
double x = r * sin(phi) * cos(theta);
double y = r * sin(phi) * sin(theta);
double z = r * cos(phi);
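
A self-contained version of the snippet above, as a sketch that assumes std::mt19937 stands in for the random(a, b) helper (the struct and function names are hypothetical). It makes the distribution checkable independently of a render: for a uniform ball, 1/8 of the points should have radius below 0.5, and 5/16 should have |z| above 0.5.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct point3 { double x, y, z; };

// Uniform point in the unit ball via corrected spherical coordinates.
inline point3 random_in_unit_ball_spherical(std::mt19937& rng) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const double pi = 3.14159265358979323846;

    const double theta = 2.0 * pi * unit(rng);            // azimuth
    // cos(phi) uniform in [-1, 1): avoids clumping along the vertical axis.
    const double phi = std::acos(2.0 * unit(rng) - 1.0);
    // Cube root: avoids clumping toward the center.
    const double r = std::cbrt(unit(rng));
    assert(r >= 0.0 && r <= 1.0);

    return { r * std::sin(phi) * std::cos(theta),
             r * std::sin(phi) * std::sin(theta),
             r * std::cos(phi) };
}
```

Counting samples with squared length below 0.25, and samples with |z| above 0.5, over a few hundred thousand draws should land near 0.125 and 0.3125 respectively.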


hollasch · 2020-11-20T19:30:59Z

hollasch
Nov 20, 2020
Maintainer

I don't want to squash the intellectual thought experiment here, but there's a practical elephant in this room.

The current approach is significantly faster in almost all cases than any analytical method so far proposed. We care much more about total render time than about optimizing the performance for any given call in the service of obtaining a single sample for a single pixel. That means that every time our random sampling returns an answer faster than the analytical approach, it effectively builds up a bank of faster runtimes to balance against any single call that is improbably much slower. You don't need to optimize any single call — you need to optimize the entire bank of calls that will be made in the service of a single render. Statistically, you've got your work cut out for you.
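
To put a rough number on that bank of calls: the unit ball fills pi/6 (about 52%) of the cube [-1, 1]^3, so a rejection loop needs about 6/pi, or roughly 1.91, draws per accepted point on average. A sketch that measures this, written in the spirit of the book's loop but using std::mt19937 and hypothetical names:

```cpp
#include <cassert>
#include <random>

struct point3 { double x, y, z; };

// Rejection sampling: draw from the cube [-1, 1)^3 until the point lies
// inside the unit ball. `attempts` reports how many draws this call needed.
inline point3 random_in_unit_ball_rejection(std::mt19937& rng, int& attempts) {
    std::uniform_real_distribution<double> coord(-1.0, 1.0);
    attempts = 0;
    while (true) {
        ++attempts;
        const point3 p{ coord(rng), coord(rng), coord(rng) };
        if (p.x * p.x + p.y * p.y + p.z * p.z < 1.0) return p;
    }
}

// Average number of draws per accepted point over n calls.
inline double mean_attempts(int n, std::mt19937& rng) {
    assert(n > 0);
    long long total = 0;
    for (int i = 0; i < n; ++i) {
        int attempts = 0;
        random_in_unit_ball_rejection(rng, attempts);
        total += attempts;
    }
    return static_cast<double>(total) / n;
}
```

The chance of needing more than k draws shrinks geometrically (about 0.476^k), which is why long runs are possible in theory but vanishingly rare in practice.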

The elephant in the room I alluded to above is this: we are implementing a Monte Carlo renderer. Any improbably bad sequence of random values can tank any number of samples in our code — random_in_unit_sphere() is just one of them. A pathological sequence can similarly ruin our motion sampling, or our scattering, or our reflection/refraction decision, and so on. This is just the bargain we make when harnessing random sampling. Every million years or so, we get a pathologically bright pink image that flashes on our screen that alarms the cat that knocks the fishbowl onto our keyboard that spills the water that dribbles backwards off the desk into our computer case that shorts the power supply that sends current through the cord we had our toes wrapped around up our leg through our heart to our hand that was resting on the lava lamp for warmth and we die a horrible death. But it's OK; more often than not we get a pretty picture.


define-private-public · 2020-11-20T22:01:34Z

define-private-public
Nov 20, 2020
Author

@trevordblack Thanks for the suggestion. I tried plugging that in, but it still doesn't look the same. The first image is the book's code for random_in_unit_sphere(), the second is using the one you provided. It's too far off to consider it an acceptable replacement.

If you can't see the difference, it's on the top-right part of the sphere. Load these up in an image viewer and toggle between them. I want to also note that it didn't render the image any faster, it was actually slower.


define-private-public · 2020-11-20T22:08:10Z

define-private-public
Nov 20, 2020
Author

@hollasch I'm trying to find out if there is a way to get a random point within the unit sphere that matches the output of the book's code (or looks very close to it) but doesn't require the branching that the book's code does. The branching is a real performance killer, so I'm trying to see if there is a more efficient algorithm for getting a random point in the unit sphere.

So far, the Method 16 I mentioned above has been the best candidate (it's faster), but I thought it looked too different from the book's method. This function also controls how subsequent rays bounce, which can radically change not only how the render looks, but the performance as well.


hollasch · 2020-11-20T22:17:01Z

hollasch
Nov 20, 2020
Maintainer

Ah. One of the first comments said “... that tries to avoid (theoretically) infinite looping and maybe branching”, so I was focusing on the former goal. To avoid branching (but spending memory), what about choosing a random entry from a cache of, say, 1,000 truly random points in a circle?
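
A sketch of what such a cache might look like, assuming a 3D ball rather than a circle (since the function under discussion is random_in_unit_sphere()) and std::mt19937 for the random source. The class and member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct point3 { double x, y, z; };

// Fixed pool of points in the unit ball, filled once by rejection sampling.
class unit_ball_cache {
public:
    unit_ball_cache(std::size_t size, std::mt19937& rng) {
        assert(size > 0);
        points_.reserve(size);
        std::uniform_real_distribution<double> coord(-1.0, 1.0);
        while (points_.size() < size) {
            const point3 p{ coord(rng), coord(rng), coord(rng) };
            if (p.x * p.x + p.y * p.y + p.z * p.z < 1.0) points_.push_back(p);
        }
    }

    // Branch-free lookup at render time: one random index, one load.
    const point3& pick(std::mt19937& rng) const {
        std::uniform_int_distribution<std::size_t> index(0, points_.size() - 1);
        return points_[index(rng)];
    }

    std::size_t size() const { return points_.size(); }

private:
    std::vector<point3> points_;
};
```

Picking an entry still costs one random number, and a pool of N entries can only ever return N distinct points, so whether the render matches the book's would need to be checked visually.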


hollasch · 2020-11-20T22:22:01Z

hollasch
Nov 20, 2020
Maintainer

Or something circular-ish. Like randomly & proportionally choosing one of 20 horizontal circle-width stripes, and then selecting a random point from those boxes? Missing out on the points that the boxes can't cover is likely imperceptible.


hollasch · 2020-11-20T22:51:02Z

hollasch
Nov 20, 2020
Maintainer

Hmmm. Here's a thought. Tessellate the circle into eight triangles (vertices at 0°, 45°, 90° and so on, on the circle). Then randomly choose a triangle using a cache of the points above (vertex[i], vertex[i-1], origin). From there, generate a random point in the triangle (see https://math.stackexchange.com/questions/18686/uniform-random-point-in-triangle).

Cost is something like 4 adds, 4 multiplies, 1 sqrt.

If you don't like the uncovered regions, increase tessellation until happy, at the expense of the size of your cached vertices.
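
The tessellation idea above could be sketched roughly as follows for the 2D case with eight triangles. This assumes std::mt19937 and hypothetical names, and uses the square-root form of the triangle formula from the linked answer, simplified for a triangle with one vertex at the origin.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <random>

struct point2 { double x, y; };

// Samples uniformly from a regular octagon inscribed in the unit circle,
// treated as eight equal triangles (origin, vertex[i], vertex[i+1]).
class octagon_sampler {
public:
    octagon_sampler() {
        const double pi = 3.14159265358979323846;
        for (int i = 0; i < 8; ++i) {
            vertices_[i] = { std::cos(i * pi / 4.0), std::sin(i * pi / 4.0) };
        }
    }

    point2 sample(std::mt19937& rng) const {
        std::uniform_int_distribution<int> pick(0, 7);
        std::uniform_real_distribution<double> unit(0.0, 1.0);
        const int i = pick(rng);
        assert(i >= 0 && i < 8);
        const point2& a = vertices_[i];
        const point2& b = vertices_[(i + 1) % 8];
        const double s = std::sqrt(unit(rng));  // distance factor from the origin
        const double t = unit(rng);             // position along the outer edge
        return { s * ((1.0 - t) * a.x + t * b.x),
                 s * ((1.0 - t) * a.y + t * b.y) };
    }

private:
    std::array<point2, 8> vertices_;
};
```

The octagon covers about 90% of the disk's area (2*sqrt(2) out of pi); more triangles shrink the uncovered slivers at the cost of a larger vertex table.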


trevordblack · 2020-11-20T23:24:43Z

trevordblack
Nov 20, 2020
Maintainer

If speed is your priority, the sampling cache is the way to go.


whydoubt · 2024-09-10T05:01:33Z

whydoubt
Sep 10, 2024
Collaborator

At this point, random_in_unit_sphere() has been retired, and just random_unit_vector() remains.
If we look at the performance of random_unit_vector, there is an alternative approach that seems more performant (it is for me at least).

My anecdotal experience, calling random_unit_vector 5*10^7 times:

Current rejection method: averaged 4.895 seconds over 10 runs
Alternate polar method: averaged 2.765 seconds over 10 runs

inline vec3 random_unit_vector() {
    auto z = random_double(-1, 1);
    auto axial_dist = std::sqrt(1 - z*z);
    auto theta = random_double(0, 2*pi);
    auto x = axial_dist * std::cos(theta);
    auto y = axial_dist * std::sin(theta);
    return vec3(x,y,z);
}

Note: I've done some profiling, and not that much time is actually spent in random_unit_vector anyway. So it doesn't seem that this would have a significant impact on overall performance.
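
For anyone who wants to repeat the comparison, a small timing harness might look like the sketch below. It is not the benchmark used for the numbers above: it assumes std::mt19937 instead of the book's random_double(), the rejection variant is written in the spirit of the book's loop rather than copied from it, and all names are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <random>

struct point3 { double x, y, z; };

// Rejection-based unit vector: sample the cube, keep points in the ball, normalize.
inline point3 unit_vector_rejection(std::mt19937& rng) {
    std::uniform_real_distribution<double> coord(-1.0, 1.0);
    while (true) {
        const point3 p{ coord(rng), coord(rng), coord(rng) };
        const double len_sq = p.x * p.x + p.y * p.y + p.z * p.z;
        if (1e-160 < len_sq && len_sq <= 1.0) {
            const double len = std::sqrt(len_sq);
            return { p.x / len, p.y / len, p.z / len };
        }
    }
}

// Polar alternative: uniform z, uniform azimuth.
inline point3 unit_vector_polar(std::mt19937& rng) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const double z = 2.0 * unit(rng) - 1.0;
    const double axial_dist = std::sqrt(1.0 - z * z);
    const double theta = 6.283185307179586 * unit(rng);  // 2*pi
    return { axial_dist * std::cos(theta), axial_dist * std::sin(theta), z };
}

// Times n calls of `sample`. `sink` accumulates the output so the compiler
// cannot discard the loop.
template <typename Sampler>
double seconds_for(Sampler sample, long long n, std::mt19937& rng, double& sink) {
    assert(n > 0);
    const auto start = std::chrono::steady_clock::now();
    for (long long i = 0; i < n; ++i) {
        const point3 p = sample(rng);
        sink += p.x + p.y + p.z;
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Results will vary with compiler, optimization flags, and the random number generator, so treat any single run as anecdotal.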

