Project 5: Hanlin Sun #5

Open: wants to merge 6 commits into base: main
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
build/
97 changes: 92 additions & 5 deletions README.md
@@ -3,10 +3,97 @@ Vulkan Grass Rendering

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Hanlin
* [LinkedIn](https://www.linkedin.com/in/hanlin-sun-7162941a5/)
* Tested on: Windows 11, i7-12700H @ 2.30GHz 32GB, NVIDIA RTX 3070Ti

### (TODO: Your README)
# Overview

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
![Overview](img/overview.gif)

This project implements a physically based grass simulation rendered in Vulkan, based on the paper [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf) by Jahrmann and Wimmer. It has three major components: rendering the grass, applying forces to control grass movement, and culling blades of grass to improve the rendering frame rate where possible.

# Features
## Grass Blade Structure
The paper models the shape of each grass blade as a Bezier curve, and this project uses the same representation for rendering. The model is replicated below:

![](img/blade_model.jpg)

Each Bezier curve has three control points:
* `v0`: the position of the grass blade on the geometry
* `v1`: a Bezier curve guide that is always "above" `v0` with respect to the grass blade's up vector
* `v2`: a physical guide on which we simulate forces
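
To make the role of the three control points concrete, here is a minimal sketch of evaluating a point along the blade with De Casteljau's algorithm. `Vec3`, `lerp`, and `bladePoint` are hypothetical names used only for illustration; the project itself does this work in its shaders, which are not shown here.

```cpp
#include <cassert>
#include <cmath>

// Minimal 3-component vector (hypothetical; a renderer would likely use glm::vec3).
struct Vec3 {
    float x, y, z;
};

// Linear interpolation between a and b.
Vec3 lerp(const Vec3& a, const Vec3& b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Evaluate the quadratic Bezier curve (v0, v1, v2) at t in [0, 1]
// using two rounds of linear interpolation (De Casteljau).
Vec3 bladePoint(const Vec3& v0, const Vec3& v1, const Vec3& v2, float t) {
    assert(t >= 0.0f && t <= 1.0f);
    Vec3 a = lerp(v0, v1, t);
    Vec3 b = lerp(v1, v2, t);
    return lerp(a, b, t);
}
```

At `t = 0` the result is the root `v0`, and at `t = 1` it is the tip `v2`; `v1` only bends the curve between them.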

Additional information is passed into the compute shader.
* `up`: the blade's up vector, which corresponds to the normal of the geometry that the grass blade resides on at `v0`
* Orientation: the orientation of the grass blade's face (as an angle offset from the x axis)
* Height: the height of the grass blade
* Width: the width of the grass blade's face
* Stiffness coefficient: the stiffness of our grass blade, which will affect the force computations on our blade

This data is packed into four `vec4`s, such that `v0.w` holds orientation, `v1.w` holds height, `v2.w` holds width, and `up.w` holds the stiffness coefficient.
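
The packing can be pictured with a small CPU-side struct. This is only a sketch: `Vec4`, `PackedBlade`, and `makeUprightBlade` are hypothetical names, and the choice of a +Y up vector with `v1` and `v2` starting at the tip is an assumption for the example, not something taken from the project code.

```cpp
#include <cassert>

struct Vec4 {
    float x, y, z, w;
};

// Hypothetical mirror of the packed layout described above.
struct PackedBlade {
    Vec4 v0; // xyz: position on the geometry, w: orientation
    Vec4 v1; // xyz: Bezier guide,             w: height
    Vec4 v2; // xyz: physical guide,           w: width
    Vec4 up; // xyz: up vector,                w: stiffness coefficient
};

// Four vec4s with no padding: 16 floats per blade.
static_assert(sizeof(PackedBlade) == 16 * sizeof(float),
              "PackedBlade should be four tightly packed vec4s");

// Build an upright blade at (px, py, pz). Assumes up = +Y (illustrative only).
PackedBlade makeUprightBlade(float px, float py, float pz,
                             float orientation, float height,
                             float width, float stiffness) {
    assert(height > 0.0f && width > 0.0f);
    PackedBlade b{};
    b.v0 = { px, py, pz, orientation };
    b.v1 = { px, py + height, pz, height };
    b.v2 = { px, py + height, pz, width };
    b.up = { 0.0f, 1.0f, 0.0f, stiffness };
    return b;
}
```

Using the `w` components this way means no extra per-blade attributes are needed beyond the four vectors.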

# Modeling Forces

## Gravity
Gravity drags down the tips of the blades, moving them towards the ground. Gravity alone would cause all the grass to fall, so a counter force is needed to balance it. This is discussed next.

![Gravity](img/gravity.png)
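
As a rough sketch of the gravity term, following my reading of the paper: the total gravity is the environmental gravity `gE` plus a "front" gravity `gF = 0.25 * |gE| * f`, where `f` is the direction the blade faces. `Vec3` and `gravityForce` are hypothetical names, and the front direction is passed in rather than derived from the blade.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 {
    float x, y, z;
};

Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 operator*(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Total gravity acting on a blade.
//   gravityDir:       unit direction of gravity (for example straight down)
//   gravityMagnitude: acceleration magnitude
//   front:            unit vector the blade's face points towards
Vec3 gravityForce(Vec3 gravityDir, float gravityMagnitude, Vec3 front) {
    assert(gravityMagnitude >= 0.0f);
    Vec3 gE = gravityDir * gravityMagnitude;    // environmental gravity
    Vec3 gF = front * (0.25f * length(gE));     // front gravity, bends the blade forward
    return gE + gF;
}
```

The front term is what makes a blade lean towards its face instead of collapsing straight down.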

## Restorative
Adding a restoring force on top of gravity pushes the blade of grass back up, allowing it to "bounce" back when gravity pushes it down. Depending on the stiffness of the blade, the recovery can be slow or fast. (Here gravity is the only external force applied.)

![Recover](img/recover.gif)
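
A minimal sketch of the restoring force, assuming the common formulation in which the force pulls `v2` back towards its rest position `iv2 = v0 + up * height`, scaled by the stiffness coefficient. `Vec3` and `recoveryForce` are hypothetical names for illustration.

```cpp
#include <cassert>

struct Vec3 {
    float x, y, z;
};

Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 operator*(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }

// Restoring force on the tip guide v2.
//   v0, up, height: define the rest position of the tip, iv2 = v0 + up * height
//   v2:             current position of the physical guide
//   stiffness:      larger values make the blade recover faster
Vec3 recoveryForce(Vec3 v0, Vec3 up, float height, Vec3 v2, float stiffness) {
    assert(stiffness >= 0.0f);
    Vec3 iv2 = v0 + up * height; // where the tip sits when no force is applied
    return (iv2 - v2) * stiffness;
}
```

When `v2` is already at rest the force is zero, and it grows linearly with the displacement, like a spring.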

## Wind
After applying gravity and the restoring force, adding a simple wind force makes the blades swing from side to side. The wind force is based on the cosine (along x) and sine (along z) of the current time, which makes the grass blades move in a circular fashion.

![Wind](img/wind.gif)
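
The circular wind described above can be sketched as a function of time. `Vec3`, `windForce`, and the `strength` parameter are hypothetical; the actual shader may scale or offset the wind differently.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 {
    float x, y, z;
};

// Wind that rotates in the xz plane over time:
// cosine drives the x component, sine drives the z component.
Vec3 windForce(float timeSeconds, float strength) {
    assert(strength >= 0.0f);
    return { strength * std::cos(timeSeconds),
             0.0f,
             strength * std::sin(timeSeconds) };
}
```

Because the x and z components are 90 degrees out of phase, the wind vector keeps a constant magnitude while its direction sweeps around a circle.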

# Optimization
Three optimizations were added to increase the frame rate. Some were more effective than others, as discussed in the performance analysis section.

## Orientation Culling
If the viewing angle is perpendicular to the thin edge of the blade of grass, the blade will not be rendered. To see the effect, I suggest focusing on a single blade and checking its visibility as the camera rotates.

![Orientation Cull](img/orientationCull.gif)
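
One way to express this test is to compare the view direction with the blade's width direction: when the two are nearly parallel, the blade is seen edge-on and can be skipped. In this sketch the names are hypothetical, the width direction is derived from the orientation angle measured from the x axis, and the `0.9` threshold is an assumption taken from my reading of the paper.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 {
    float x, y, z;
};

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    assert(len > 0.0f);
    return { v.x / len, v.y / len, v.z / len };
}

// Returns true when the blade should be culled because it is viewed edge-on.
//   orientation: blade face angle in radians, measured from the x axis
//   viewDir:     direction from the camera towards the blade (any length)
bool cullByOrientation(float orientation, Vec3 viewDir, float threshold = 0.9f) {
    Vec3 widthDir = { std::cos(orientation), 0.0f, std::sin(orientation) };
    float alignment = std::fabs(dot(normalize(viewDir), widthDir));
    return alignment > threshold;
}
```

Lowering the threshold culls more aggressively, at the cost of blades popping in and out as the camera turns.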

## Frustum Culling
If a grass blade's world position is outside the camera's view frustum, the blade is culled. The effect is hard to see directly, because the rendering pipeline already discards geometry outside the window (although without frustum culling those blades are still processed, which costs performance). I therefore display the FPS value: with only frustum culling enabled, moving the camera forward raises the FPS, which means fewer blades are being rendered. This indicates that blades outside the camera frustum are culled and that the algorithm works.

![Frustum Cull](img/FrustumCull.gif)
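
The test itself can be sketched as follows, assuming the points have already been multiplied by the view-projection matrix. As I understand the paper, three points are checked (`v0`, `v2`, and the curve midpoint `m = 0.25 * v0 + 0.5 * v1 + 0.25 * v2`), and a blade is culled only when all three fall outside. `Vec4`, `insideClipVolume`, `cullByFrustum`, and the `tolerance` parameter are hypothetical names.

```cpp
#include <cassert>

struct Vec4 {
    float x, y, z, w;
};

// True when a clip-space point lies inside the view volume, widened by a
// small tolerance so blades near the border are not culled too early.
bool insideClipVolume(const Vec4& clip, float tolerance) {
    assert(tolerance >= 0.0f);
    float h = clip.w + tolerance;
    return clip.x >= -h && clip.x <= h &&
           clip.y >= -h && clip.y <= h &&
           clip.z >= -h && clip.z <= h;
}

// A blade is culled only if its root, midpoint, and tip are all outside.
bool cullByFrustum(const Vec4& v0Clip, const Vec4& midClip, const Vec4& v2Clip,
                   float tolerance) {
    return !insideClipVolume(v0Clip, tolerance) &&
           !insideClipVolume(midClip, tolerance) &&
           !insideClipVolume(v2Clip, tolerance);
}
```

Testing three points instead of one avoids culling a blade whose root is off screen while its tip is still visible.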

## Distance Culling
The distance culling operation removes blades based on the distance of the blade from the camera (projected on to the ground plane). The blade distance is discretized into a distance level bucket, in which a certain percentage of blades will be culled. The further the distance level is from the camera, the more blades will be culled.

![Distance Cull](img/DistanceCull.gif)
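
The bucket scheme can be sketched like this, following my reading of the paper: a blade with index `id` is kept while `id mod n` is below `floor(n * (1 - d / dMax))`. `cullByDistance` and its parameter names are hypothetical, and the bucket count and maximum distance are tuning values chosen by the user.

```cpp
#include <cassert>
#include <cmath>

// Returns true when the blade should be culled.
//   bladeId:           index of the blade in the buffer
//   projectedDistance: camera-to-blade distance projected onto the ground plane
//   maxDistance:       distance beyond which every blade is culled
//   numBuckets:        number of discrete distance levels (n)
bool cullByDistance(unsigned bladeId, float projectedDistance,
                    float maxDistance, unsigned numBuckets) {
    assert(projectedDistance >= 0.0f && maxDistance > 0.0f && numBuckets > 0);
    if (projectedDistance >= maxDistance) {
        return true; // too far away: nothing is kept
    }
    float keepFraction = 1.0f - projectedDistance / maxDistance;
    unsigned keepCount = static_cast<unsigned>(std::floor(numBuckets * keepFraction));
    return (bladeId % numBuckets) >= keepCount;
}
```

With 10 buckets and a blade halfway to the maximum distance, blades whose index modulo 10 is 0 through 4 are kept and the other half are culled.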

# Performance Analysis
All tests below were performed with the camera situated above the plane, at an angle. `r = 10.0, theta = 0.f, phi = 0.0`.

## Impact of Blades
As expected, increasing the number of blades decreases the frame rate, because more blades need to be computed and more blades are sent through the rendering pipeline. With more than 65536 blades, individual blades can no longer be told apart on screen, so figures beyond that point are only helpful for seeing the FPS hit. The graph below shows how the FPS decreases exponentially as the number of blades is increased. From the graph, we can also see that enabling all forms of culling gives a nearly 2.5x boost in most cases.

![NumBlades_FPS](img/NumBlade_FPS.png)

## Optimization

From the graphs below we can see that distance culling has the most significant effect, view-frustum culling has the least, and orientation culling falls in between.
Moreover, the FPS with orientation culling or view-frustum culling alone was very close to the FPS without culling. My guess is that the culling tests themselves cost some computation, and with these two methods only a few blades were culled, so the net effect was small.

Distance culling, by contrast, removes a large share of the blades, which brings the most significant performance increase.

Separate graphs are used because increasing the number of blades causes large differences in FPS magnitude, which are hard to see on a single graph.

![NumBlades_FPS](img/16384.png)

![NumBlades_FPS](img/32768.png)

![NumBlades_FPS](img/65536.png)

![NumBlades_FPS](img/131072.png)

# References
* Getting camera eye from view matrix
* Invert to find camera space in terms of world space and take displacement in fourth column of homogeneous matrix
* https://www.3dgep.com/understanding-the-view-matrix/
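
For reference, the camera-eye recovery mentioned above can be sketched without a full matrix inverse, assuming the view matrix is rigid (a rotation `R` plus a translation `t`, no scale). In that case the inverse is `[R^T | -R^T t]`, so the eye position is `-R^T t`. `Mat4`, `Vec3`, and `cameraEye` are hypothetical names, and the matrix is stored column-major as `m[column][row]`.

```cpp
#include <cassert>

struct Vec3 {
    float x, y, z;
};

// Column-major 4x4 matrix: m[column][row].
struct Mat4 {
    float m[4][4];
};

// World-space camera position for a rigid view matrix (rotation + translation).
Vec3 cameraEye(const Mat4& view) {
    assert(view.m[3][3] == 1.0f); // expects an affine view matrix
    float tx = view.m[3][0];
    float ty = view.m[3][1];
    float tz = view.m[3][2];
    // eye = -(R^T * t); row i of R^T is column i of R.
    return { -(view.m[0][0] * tx + view.m[0][1] * ty + view.m[0][2] * tz),
             -(view.m[1][0] * tx + view.m[1][1] * ty + view.m[1][2] * tz),
             -(view.m[2][0] * tx + view.m[2][1] * ty + view.m[2][2] * tz) };
}
```

This gives the same result as taking the fourth column of the inverted view matrix, as long as the view matrix contains no scaling.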
Binary file added bin/Debug/vulkan_grass_rendering.exe
Binary file not shown.
Binary file added bin/Debug/vulkan_grass_rendering.pdb
Binary file not shown.
Binary file added img/131072.png
Binary file added img/16384.png
Binary file added img/32768.png
Binary file added img/65536.png
Binary file added img/DistanceCull.gif
Binary file added img/FrustumCull.gif
Binary file added img/NumBlade_FPS.png
Binary file added img/gravity.png
Binary file added img/orientationCull.gif
Binary file added img/overview.gif
Binary file added img/recover.gif
Binary file added img/wind.gif
2 changes: 1 addition & 1 deletion src/Blades.h
@@ -4,7 +4,7 @@
#include <array>
#include "Model.h"

constexpr static unsigned int NUM_BLADES = 1 << 13;
constexpr static unsigned int NUM_BLADES = 1 << 18;
constexpr static float MIN_HEIGHT = 1.3f;
constexpr static float MAX_HEIGHT = 2.5f;
constexpr static float MIN_WIDTH = 0.1f;
189 changes: 174 additions & 15 deletions src/Renderer.cpp
@@ -195,9 +195,44 @@ void Renderer::CreateTimeDescriptorSetLayout() {
}

void Renderer::CreateComputeDescriptorSetLayout() {
// TODO: Create the descriptor set layout for the compute pipeline
// DONE: Create the descriptor set layout for the compute pipeline
// Remember this is like a class definition stating what types of information
// will be stored at each binding

//Check the document
//Input Blades
VkDescriptorSetLayoutBinding inputGrassBladeLayoutBinding = {};
inputGrassBladeLayoutBinding.binding = 0;
inputGrassBladeLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
inputGrassBladeLayoutBinding.descriptorCount = 1;
inputGrassBladeLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
inputGrassBladeLayoutBinding.pImmutableSamplers = nullptr;

VkDescriptorSetLayoutBinding cullBladesLayoutBinding = {};
cullBladesLayoutBinding.binding = 1;
cullBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
cullBladesLayoutBinding.descriptorCount = 1;
cullBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
cullBladesLayoutBinding.pImmutableSamplers = nullptr;

VkDescriptorSetLayoutBinding numBladesLayoutBinding = {};
numBladesLayoutBinding.binding = 2;
numBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
numBladesLayoutBinding.descriptorCount = 1;
numBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
numBladesLayoutBinding.pImmutableSamplers = nullptr;

std::vector<VkDescriptorSetLayoutBinding> bindings = { inputGrassBladeLayoutBinding, cullBladesLayoutBinding, numBladesLayoutBinding };

// Create the descriptor set layout
VkDescriptorSetLayoutCreateInfo layoutInfo = {};
layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size());
layoutInfo.pBindings = bindings.data();

if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) {
throw std::runtime_error("Failed to create descriptor set layout");
}
}

void Renderer::CreateDescriptorPool() {
@@ -215,7 +250,9 @@ void Renderer::CreateDescriptorPool() {
// Time (compute)
{ VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 },

// TODO: Add any additional types and counts of descriptors you will need to allocate
// DONE: Add any additional types and counts of descriptors you will need to allocate
// Blades compute buffers
{VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,static_cast<uint32_t>(3*scene->GetBlades().size())},
};

VkDescriptorPoolCreateInfo poolInfo = {};
@@ -318,8 +355,47 @@ void Renderer::CreateModelDescriptorSets() {
}

void Renderer::CreateGrassDescriptorSets() {
// TODO: Create Descriptor sets for the grass.
// DONE: Create Descriptor sets for the grass.
// This should involve creating descriptor sets which point to the model matrix of each group of grass blades
grassDescriptorSets.resize(scene->GetBlades().size());

// vkAllocateDescriptorSets reads one layout per requested set, so repeat the layout for every set
std::vector<VkDescriptorSetLayout> layouts(grassDescriptorSets.size(), modelDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();

// Allocate descriptor sets
if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS)
{
throw std::runtime_error("Failed to allocate descriptor set");
}

// Configure the descriptors to refer to buffers.
// The buffer infos live in a vector outside the loop so that the pointers stored in
// descriptorWrites are still valid when vkUpdateDescriptorSets is called.
std::vector<VkDescriptorBufferInfo> bladeBufferInfos(grassDescriptorSets.size());
std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size());
for (uint32_t i = 0; i < scene->GetBlades().size(); i++)
{
// Model matrix buffer for this group of blades
bladeBufferInfos[i].buffer = scene->GetBlades()[i]->GetModelBuffer();
bladeBufferInfos[i].offset = 0;
bladeBufferInfos[i].range = sizeof(ModelBufferObject);

descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[i].dstSet = grassDescriptorSets[i];
descriptorWrites[i].dstBinding = 0;
descriptorWrites[i].dstArrayElement = 0;
descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
descriptorWrites[i].descriptorCount = 1;
descriptorWrites[i].pBufferInfo = &bladeBufferInfos[i];
descriptorWrites[i].pImageInfo = nullptr;
descriptorWrites[i].pTexelBufferView = nullptr;
}
// Update descriptor sets
vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}

void Renderer::CreateTimeDescriptorSet() {
@@ -358,8 +434,78 @@ void Renderer::CreateTimeDescriptorSet() {
}

void Renderer::CreateComputeDescriptorSets() {
// TODO: Create Descriptor sets for the compute pipeline
// The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades
// DONE: Create Descriptor sets for the compute pipeline
// The descriptors should point to Storage buffers
//which will hold the grass blades, the culled grass blades, and the output number of grass blades

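// One descriptor set per group of blades, each with three storage-buffer bindings
computeDescriptorSets.resize(scene->GetBlades().size());

// Describe the descriptor sets.
// vkAllocateDescriptorSets reads one layout per requested set, so repeat the layout for every set
std::vector<VkDescriptorSetLayout> layouts(computeDescriptorSets.size(), computeDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();

// Allocate descriptor sets
if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS)
{
throw std::runtime_error("Failed to allocate descriptor set");
}

// The buffer infos live in a vector outside the loop so that the pointers stored in
// descriptorWrites are still valid when vkUpdateDescriptorSets is called.
std::vector<VkDescriptorBufferInfo> bufferInfos(3 * computeDescriptorSets.size());
std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size());
for (uint32_t i = 0; i < scene->GetBlades().size(); ++i)
{
// Input blades. The range is the size of one Blade struct times the blade count
// (sizeof(Blades) would measure the container class instead).
VkDescriptorBufferInfo& grassBladeBufferInfo = bufferInfos[3 * i + 0];
grassBladeBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer();
grassBladeBufferInfo.offset = 0;
grassBladeBufferInfo.range = NUM_BLADES * sizeof(Blade);

// Blades that survive culling
VkDescriptorBufferInfo& culledBladeBufferInfo = bufferInfos[3 * i + 1];
culledBladeBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer();
culledBladeBufferInfo.offset = 0;
culledBladeBufferInfo.range = NUM_BLADES * sizeof(Blade);

// Indirect draw arguments holding the number of surviving blades
VkDescriptorBufferInfo& numBladeBufferInfo = bufferInfos[3 * i + 2];
numBladeBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer();
numBladeBufferInfo.offset = 0;
numBladeBufferInfo.range = sizeof(BladeDrawIndirect);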

descriptorWrites[3 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 0].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 0].dstBinding = 0;
descriptorWrites[3 * i + 0].dstArrayElement = 0;
descriptorWrites[3 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 0].descriptorCount = 1;
descriptorWrites[3 * i + 0].pBufferInfo = &grassBladeBufferInfo;
descriptorWrites[3 * i + 0].pImageInfo = nullptr;
descriptorWrites[3 * i + 0].pTexelBufferView = nullptr;

descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 1].dstBinding = 1;
descriptorWrites[3 * i + 1].dstArrayElement = 0;
descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 1].descriptorCount = 1;
descriptorWrites[3 * i + 1].pBufferInfo = &culledBladeBufferInfo;
descriptorWrites[3 * i + 1].pImageInfo = nullptr;
descriptorWrites[3 * i + 1].pTexelBufferView = nullptr;

descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 2].dstBinding = 2;
descriptorWrites[3 * i + 2].dstArrayElement = 0;
descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 2].descriptorCount = 1;
descriptorWrites[3 * i + 2].pBufferInfo = &numBladeBufferInfo;
descriptorWrites[3 * i + 2].pImageInfo = nullptr;
descriptorWrites[3 * i + 2].pTexelBufferView = nullptr;

}
// Update descriptor sets
vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}

void Renderer::CreateGraphicsPipeline() {
Expand Down Expand Up @@ -716,8 +862,8 @@ void Renderer::CreateComputePipeline() {
computeShaderStageInfo.module = computeShaderModule;
computeShaderStageInfo.pName = "main";

// TODO: Add the compute dsecriptor set layout you create to this list
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout };
// DONE: Add the compute descriptor set layout you create to this list
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout };

// Create pipeline layout
VkPipelineLayoutCreateInfo pipelineLayoutInfo = {};
@@ -883,7 +1029,14 @@ void Renderer::RecordComputeCommandBuffer() {
// Bind descriptor set for time uniforms
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr);

// TODO: For each group of blades bind its descriptor set and dispatch
// DONE: For each group of blades bind its descriptor set and dispatch
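for (size_t i = 0; i < computeDescriptorSets.size(); i++)
{
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout,
2, 1, &computeDescriptorSets[i], 0, nullptr);
// One workgroup per WORKGROUP_SIZE blades. This assumes NUM_BLADES is a multiple of
// WORKGROUP_SIZE; otherwise the trailing blades would not be processed.
vkCmdDispatch(computeCommandBuffer, NUM_BLADES / WORKGROUP_SIZE, 1, 1);
}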

// ~ End recording ~
if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) {
@@ -932,6 +1085,7 @@ void Renderer::RecordCommandBuffers() {
renderPassInfo.pClearValues = clearValues.data();

std::vector<VkBufferMemoryBarrier> barriers(scene->GetBlades().size());

for (uint32_t j = 0; j < barriers.size(); ++j) {
barriers[j].sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barriers[j].srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
@@ -975,14 +1129,14 @@ void Renderer::RecordCommandBuffers() {
for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) {
VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() };
VkDeviceSize offsets[] = { 0 };
// TODO: Uncomment this when the buffers are populated
// vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);

// TODO: Bind the descriptor set for each grass blades model
// DONE: Uncomment this when the buffers are populated
vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);

// DONE: Bind the descriptor set for each grass blades model
vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr);
// Draw
// TODO: Uncomment this when the buffers are populated
// vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
// DONE: Uncomment this when the buffers are populated
vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
}

// End render pass
@@ -1041,7 +1195,7 @@ void Renderer::Frame() {
Renderer::~Renderer() {
vkDeviceWaitIdle(logicalDevice);

// TODO: destroy any resources you created
// DONE: destroy any resources you created

vkFreeCommandBuffers(logicalDevice, graphicsCommandPool, static_cast<uint32_t>(commandBuffers.size()), commandBuffers.data());
vkFreeCommandBuffers(logicalDevice, computeCommandPool, 1, &computeCommandBuffer);
@@ -1058,6 +1212,11 @@ Renderer::~Renderer() {
vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr);

// =============== Destroy added descriptorSetLayout =========================

vkDestroyDescriptorSetLayout(logicalDevice,computeDescriptorSetLayout,nullptr);
// =============== End =======================

vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr);

vkDestroyRenderPass(logicalDevice, renderPass, nullptr);