[Discussion] Trying to improve performance for random_in_unit_sphere()
#765
Replies: 16 comments
-
So I think I've figured out why this is happening. I plotted out some random points in the 2D case. First using the method in the book, and then the one using a polar method. Randomly generated points are more clustered in the center in the "polar method", but more spread out at the edges. Looking at this, I think I may have devised a method to improve performance in the https://gist.github.com/define-private-public/f51ddbfc5e5c91dacbc73fa87517af23 |
Beta Was this translation helpful? Give feedback.
-
Try this: http://extremelearning.com.au/how-to-generate-uniformly-random-points-on-n-spheres-and-n-balls/ |
Beta Was this translation helpful? Give feedback.
-
Thanks for the link, it was a good read. The only method that seemed to be more performant for me was Method 16 (the polar for a 3D space). But it wasn't producing the same (or very similar) results visually, it was a bit off. To some it may not be noticeable on an initial glance. If you render book 2's final scene and look at the fuzzy metal sphere off to the right; once with the |
Beta Was this translation helpful? Give feedback.
-
Can't argue here since I can't proove the formula for 3D space. I don't know why |
Beta Was this translation helpful? Give feedback.
-
Hi, do you mean this:
If so, this is what I got, which is not correct: |
Beta Was this translation helpful? Give feedback.
-
Right... Ok, so I can't help here. My blind shot was missed... |
Beta Was this translation helpful? Give feedback.
-
Cube root works with normal distribution |
Beta Was this translation helpful? Give feedback.
-
|
Beta Was this translation helpful? Give feedback.
-
I don't want to squash the intellectual thought experiment here, but there's a practical elephant in this room. The current approach is significantly faster in almost all cases than any analytical method so far proposed. We care much more about total render time than about optimizing the performance for any given call in the service of obtaining a single sample for a single pixel. That means that every time our random sampling returns an answer faster than the analytical approach, it effectively builds up a bank of faster runtimes to balance against any single call that is improbably much slower. You don't need to optimize any single call — you need to optimize the entire bank of calls that will be made in the service of a single render. Statistically, you've got your work cut out for you. The elephant in the room I alluded to above is this: we are implementing a Monte Carlo renderer. Any improbably bad sequence of random values can tank any number of samples in our code — |
Beta Was this translation helpful? Give feedback.
-
@trevordblack Thanks for the sugestion. I tried plugging that in, but it still doesn't look the same. The first image is the book's code for If you can't see the difference, it's on the top-right part of the sphere. Load these up in an image viewer and toggle between them. I want to also note that it didn't render the image any faster, it was actually slower. |
Beta Was this translation helpful? Give feedback.
-
@hollasch I'm trying to find out if there is a way to get a random value that's within a unit sphere, that matches the output of the book's code (or looks very close to it), but doesn't require the branching that the book's code does. The branching is a real performance killer, so I'm trying to see if there is more efficient algorithm for getting a random in the unit sphere. So far, that Method 16 I mentioned above has been the best candidate (faster), but I thought it was too far looking different from the books' method. This function also controls how subsequent rays bounce, which can radically change not only how the render looks, but the performance as well. |
Beta Was this translation helpful? Give feedback.
-
Ah. One of the first comments said “... that tries to avoid (theoretically) infinite looping and maybe branching”, so I was focusing on the former goal. To avoid branching (but spending memory), what about choosing a random entry from a cache of, say, 1,000 truly random points in a circle? |
Beta Was this translation helpful? Give feedback.
-
Or something circular-ish. Like randomly & proportionally choosing one of 20 horizontal circle-width stripes, and then selecting a random point from those boxes? Missing out on the points that the boxes can't cover is likely imperceptible. |
Beta Was this translation helpful? Give feedback.
-
Hmmm. Here's a thought. Tessellate the circle into eight triangles (vertices at 0°, 45°, 90° and so on, on the circle). Then randomly choose a triangle using a cache of the points above (vertex[i], vertex[i-1], origin). From there, generate a random point in the triangle (see https://math.stackexchange.com/questions/18686/uniform-random-point-in-triangle). Cost is something like 4 adds, 4 multiplies, 1 sqrt. If you don't like the uncovered regions, increase tessellation until happy, at the expense of the size of your cached vertices. |
Beta Was this translation helpful? Give feedback.
-
If speed is your priority, the sampling cache is the way to go. |
Beta Was this translation helpful? Give feedback.
-
At this point, My anecdotal experience, calling random_unit_vector 5*10^7 times:
inline vec3 random_unit_vector() {
auto z = random_double(-1, 1);
auto axial_dist = std::sqrt(1 - z*z);
auto theta = random_double(0, 2*pi);
auto x = axial_dist * std::cos(theta);
auto y = axial_dist * std::sin(theta);
return vec3(x,y,z);
} Note: I've done some profiling, and not that much time is actually spent in |
Beta Was this translation helpful? Give feedback.
-
I'm talking about this function:
raytracing.github.io/src/common/vec3.h
Line 137 in 1c9562f
It can run for a long, very long time. I'm trying to find a way if I can replicate the results of it by getting rid of the while loop. For reference, here is how it looks now for the
metal
material:At first, I tried being sneaky and uses a magic number, to ensure that the squared length was always 1 or less:
While this was technically returning a vector that was within a unit sphere, it wasn't covering all the cases that reference implementation could generate. It produced an image that looked like this:
I tried generating a vector that would be within the unit sphere, but I generated the values in spherical coordinates, and then would convert it to the Cartesian. Here is the code:
This didn't work either, and produced this image; looking more specular:
I even tried creating a pool of "in unit sphere" vectors (that would be generated with the book's code). It would generated a larger pool when it got exhausted. It's perf was slightly slower.
Does anyone have an idea of what could be a more performant implementation of
random_in_unit_sphere()
that tries to avoid (theoretically) infinite looping and maybe branching? Is my spherical->Cartesian incorrect?Beta Was this translation helpful? Give feedback.
All reactions