Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Project 3: Shixuan Fang #12

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions MY_BUILD/ZERO_CHECK.proj
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="16.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<Target Name="Build" Inputs="C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\MY_BUILD\CMakeFiles\7038008a80b7806c9e4bb49bc892b370\generate.stamp.rule;C:\Program Files\CMake\share\cmake-3.24\Modules\CMakeCInformation.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\CMakeCXXInformation.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\CMakeCommonLanguageInclude.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\CMakeGenericSystem.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\CMakeInitializeConfigs.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\CMakeLanguageInformation.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\CMakeRCInformation.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\CMakeSystemSpecificInformation.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\CMakeSystemSpecificInitialize.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\Compiler\CMakeCommonCompilerMacros.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\Compiler\MSVC-C.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\Compiler\MSVC-CXX.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\Compiler\MSVC.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\FindCUDA.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\FindCUDA\run_nvcc.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\FindCUDA\select_compute_arch.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\FindOpenGL.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\FindPackageHandleStandardArgs.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\FindPackageMessage.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\Platform\Windows-MSVC-C.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\Platform\Windows-MSVC-CXX.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\Platform\Windows-MSVC.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\Platform\Windows.cmake;C:\Program Files\CMake\share\cmake-3.24\Modules\Platform\WindowsPaths.cmake;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\CMakeLists.txt;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\MY_BUILD\CMakeFiles\3.24.0-rc3\CMakeCCompiler.cmake;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\MY_BUILD\CMakeFiles\3.24.0-rc3\CMakeCXXCompiler.cmake;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\MY_BUILD\CMakeFiles\3.24.0-rc3\CMakeRCCompiler.cmake;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\MY_BUILD\CMakeFiles\3.24.0-rc3\CMakeSystem.cmake;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\MY_BUILD\CMakeFiles\cis565_path_tracer.dir\src\cis565_path_tracer_generated_pathtrace.cu.obj.depend;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\cmake\CUDAComputesList.cmake;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\cmake\FindGLEW.cmake;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\cmake\FindGLFW.cmake;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\cmake\FindGLM.cmake;C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\cmake\cuda_compute_capability.cpp" Outputs="C:\Users\ursai\Desktop\565\Project3-CUDA-Path-Tracer\MY_BUILD\CMakeFiles\generate.stamp">
<Message Text="Checking Build System" />
<Exec Command="setlocal&#10;&quot;C:\Program Files\CMake\bin\cmake.exe&quot; -SC:/Users/ursai/Desktop/565/Project3-CUDA-Path-Tracer -BC:/Users/ursai/Desktop/565/Project3-CUDA-Path-Tracer/MY_BUILD --check-stamp-list CMakeFiles/generate.stamp.list --vs-solution-file C:/Users/ursai/Desktop/565/Project3-CUDA-Path-Tracer/MY_BUILD/cis565_path_tracer.sln&#10;if %errorlevel% neq 0 goto :cmEnd&#10;:cmEnd&#10;endlocal &amp; call :cmErrorLevel %errorlevel% &amp; goto :cmDone&#10;:cmErrorLevel&#10;exit /b %1&#10;:cmDone&#10;if %errorlevel% neq 0 goto :VCEnd&#10;:VCEnd" />
</Target>
<Target Name="Clean" />
<Target Name="GetTargetPath" />
<Target Name="GetNativeManifest" />
<Target Name="GetCopyToOutputDirectoryItems" />
</Project>
Binary file added MY_BUILD/cornell.2022-12-02_13-04-25z.46samp.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
10 changes: 10 additions & 0 deletions MY_BUILD/imgui.ini
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
[Window][Debug##Default]
Pos=60,60
Size=400,400
Collapsed=0

[Window][Path Tracer Analytics]
Pos=60,60
Size=339,65
Collapsed=0

129 changes: 124 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,130 @@ CUDA Path Tracer

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Shixuan Fang
- [LinkedIn](https://www.linkedin.com/in/shixuan-fang-4aba78222/)
* Tested on: Windows 11, i7-12700kf, RTX3080Ti (Personal)

### (TODO: Your README)
<p align="center">
<img src="/img/final_picture.png" width="400" height="400">
</p>

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
## Project Overview

In this project, I implemented a CUDA-based path tracer, which is very different from CPU path tracer. Pathtracing in CPU is used to be recursive, but since CUDA doesn't support recursion(or very slow on new GPUs), this path tracer uses a iterative way.
<p align="center">
<img style="float: right;" src="https://user-images.githubusercontent.com/54868517/194418589-a882f9be-abda-4ae0-afe7-5b39bb03791e.png" width="500" height="200">
</p>

Also, traditional CPU pathtracing is down pixel by pixel, but as the figure shows, doing the same algorithm(parallel by pixel) on GPU will cause different threads finish in different time and therefore cause many divergence. The solution, then, is to parallel by rays. We launch a kernel that traces **one** bounce for every ray from a ray pool, update the result, and then terminate rays with stream compaction


## Core Features

### 1. Basic shading

There are three basic shadings implemented --- pure diffuse, pure reflection, and both reflection and refraction. Diffuse is done by generating a new direction randomly across the hemisphere, reflection is done by reflecting the incident ray direction by the surface normal, and refraction is done by using Schlick's approximation to create Fresnel effect.
<p align="center">
<img src="https://user-images.githubusercontent.com/54868517/194421652-410c82f6-60c2-4ca4-8200-2039355fc622.png" width="300" height="300">
</p>

### 2. Acceleration

- Stream Compaction

<p align="center">
<img src="https://user-images.githubusercontent.com/54868517/194587050-ec6e6061-d54c-47e0-a152-7444923fd8e4.png" width="550" height="300"/>
</p>

The first acceleration is to remove unnecessary rays that are terminated(remaining bounce = 0) by using Stream Compaction. In this project I used ```thrust::stable_partition``` to compact. As seen in the chart, stream compaction will give noticeable performance improvement, especially with higher object number. This is because there will be many rays shooting outside the scene and then be terminated.

- Sort rays by material type

<p align="center">
<img src="https://user-images.githubusercontent.com/54868517/194594883-13248d7a-6478-4733-9ef8-01ec234ad0ea.png" width="550" height="300"/>
</p>

The second acceleration is to sort all rays by their material types, which can result in good memory access pattern for CUDA. In this project I used ```thrust::sort_by_key``` to sort. However, the result isn't as I expect, as sorting the material actually slow down the performance. I assume that sorting materials takes much longer than performance gain through memory coalescing.


- Cache the first bounce for re-use

<p align="center">
<img src="https://user-images.githubusercontent.com/54868517/194600817-b4a46c4e-c241-4f3e-a6eb-920904a6a3b2.png" width="550" height="300"/>

</p>

The third acceleration is to cache the first bounce intersection data, therefore all other iterations can use the same data and save some performance. As seen in the graph, with different max ray depth there is always approximately 4 second performance inprovement.

### 3. Microfacet shading model

Microfacet shading model is a very popular pbr model, which works by statistically modeling the scattering of light from a large collection of microfacets. This model is based on Cook-Torrance Reflection Model, which is showed as the following equation.
<p align="center">
<img src="https://user-images.githubusercontent.com/54868517/194615898-b098f124-5801-405e-81b0-9f7c3949de12.png">
</p>

In this equation, D represents the distribution of all microfacets' normal distribution, G is the geometry term which discribes the shadowing and masking effects of the surface, and F is the Fresnel term which describes how many light are reflected. There are many ways to describe these three terms, and in this project I choose GGX approximation for normal distribution function, Smith's shadowing and masking term as approximation for G function, and Schlick's approximation for F function. In these functions, there are **2** important factors, **Metalness** and **Roughness**. The following two figures demonstrate how these two factors affect the result. In the left, all three bunnys have the same roughness but different metalness, and from right to left the bunny looks more like metal. In the right, all three bunnys have same metalness but different roughness, and from left to right the bunny reflects less light and therefore looks darker.

<p align="center">
<img src="img/diff_metalness.png" width="300" height="300"/>
<img src="img/diff_roughness.png" width="300" height="300"/>
</p>

However, in this model the energy is not conservative, which is because this model doesn't consider multiple bounces, shown in the following figure(from [Kulla](https://blog.selfshadow.com/publications/s2017-shading-course/))

<p align="center">
<img src="https://user-images.githubusercontent.com/54868517/194618245-64e9fdaf-9901-4304-b361-8a455df197d2.png" width="600" height="250"/>
</p>

This can be solved using Kulla-Conty Approximation, but since it's a little complicated and needs pre-computation, I didn't implemented in this project.

### 4. Mesh loading with tinyObj

In order to test my microfacet shading model, I need to add more complex meshes and decided to use tinyObj to load ```OBJ``` format meshes. In order to speed up ray-mesh intersetion, I created a toggleable AABB bounding box for intersection culling.

<p align="center">
<img src="https://user-images.githubusercontent.com/54868517/194611878-93621d64-c496-41a8-a1d8-5f08f3eb3ef0.png" width="550" height="300"/>
</p>

As seen in the chart, with AABB mesh culling there is some performance improvement, but it's not that huge as I expect. I think it's because the bounding box is very small anyway, so there isn't too much improvement.

### 5. Stochastic Sampled Antialiasing

According to [Stochastic Sampling](https://web.cs.wpi.edu/~matt/courses/cs563/talks/antialiasing/stochas.html), in stochastic sampling, every point has a finite probability of being hit. Stochastic sampling is extremely powerful as the visual system is much more sensitive to aliases than noise. The one that I choose to implement is Jittered, which is done by perturbing the position that camera shoot rays.

**Left: no antialiasing**
**right: with antialiasing**


<p align="center">
<img src="https://user-images.githubusercontent.com/54868517/194441398-835325e3-cfa5-4f2b-9d6f-3d99b6948547.png" width="300" height="300"/>
<img src="https://user-images.githubusercontent.com/54868517/194441429-c3a3363c-28a6-4d64-b558-8e3af246329d.png" width="300" height="300"/>
</p>

### 6. Physically-based Depth-of-field

In regular pathtracing, we assume that the camera is an ideal pinhole camera, which is not physically possible. In order to create a physically-based camera, a classic way is the **thin lens approximation**. The algorithm I choose to implement is adapted from [PBRT[6.2.3]](https://www.pbr-book.org/3ed-2018/Camera_Models/Projective_Camera_Models), and the idea is quite simple. We set another 2 parameters for the camera, which are **lensRadius** and **focalDistance**. We sample on a concentric disk based on the lensRadius, and then jitter the camera ray based on that sample position and focalDistance. This simple algorithem can actually create very beautiful results.

<p align="center">
<img src="https://user-images.githubusercontent.com/54868517/194445204-bee8a91c-3f3f-4e00-b104-3a88ff2d1b5e.png" width="300" height="300"/>
<img src="https://user-images.githubusercontent.com/54868517/194445206-c30230a2-3a20-41a0-856f-3ac8a5a502ae.png" width="300" height="300"/>
<img src="https://user-images.githubusercontent.com/54868517/194445205-6e306d7c-6ef7-45c5-a879-7495620d6a27.png" width="300" height="300"/>
</p>

### 7. Direct Lighting

This is done by taking the final ray(remaining bounce = 1) to a random point on an emissive object.
It's a little hard to tell the difference, but the left one has no direct lighting and the right one has direct lighting.
The right one is slightly brighter that the left one, and it also converges faster

<p align="center">
<img src="img/no_anti_alias.png" width="300" height="300"/>
<img src="img/direct_lighting.png" width="300" height="300"/>
</p>

## References

- https://www.pbr-book.org/3ed-2018/Reflection_Models/Microfacet_Models
- https://blog.selfshadow.com/publications/s2017-shading-course/
- https://www.pbr-book.org/3ed-2018/Camera_Models/Environment_Camera
- https://web.cs.wpi.edu/~matt/courses/cs563/talks/antialiasing/stochas.html
24 changes: 24 additions & 0 deletions external/include/cornell_box.mtl
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
newmtl white
Ka 0 0 0
Kd 1 1 1
Ks 0 0 0

newmtl red
Ka 0 0 0
Kd 1 0 0
Ks 0 0 0

newmtl green
Ka 0 0 0
Kd 0 1 0
Ks 0 0 0

newmtl blue
Ka 0 0 0
Kd 0 0 1
Ks 0 0 0

newmtl light
Ka 20 20 20
Kd 1 1 1
Ks 0 0 0
Loading