Software renderer redesign part 3 #258
Merged
Also added default voxel transform ID for memory savings since a lot of voxels don't have special transformations.
Also use default voxel transform buffer ID in various places. Need to do entity transforms next.
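A minimal sketch of why a shared default ID saves memory, assuming only voxels with special transformations get their own buffer. `VoxelTransformTable`, `Transform`, and `TransformBufferID` are hypothetical names for illustration, not the renderer's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

using TransformBufferID = int; // Hypothetical handle type.

struct Transform { float x, y, z; }; // Stand-in for a real transform.

class VoxelTransformTable {
public:
    static constexpr TransformBufferID DefaultID = 0;

    VoxelTransformTable() {
        // One shared identity-like transform lives at the default ID.
        transforms.push_back(Transform{ 0.0f, 0.0f, 0.0f });
    }

    // Only voxels with a special transformation allocate their own buffer.
    TransformBufferID allocate(int voxelIndex, const Transform &transform) {
        transforms.push_back(transform);
        const TransformBufferID id = static_cast<TransformBufferID>(transforms.size() - 1);
        overrides[voxelIndex] = id;
        return id;
    }

    // Every other voxel resolves to the shared default.
    TransformBufferID getID(int voxelIndex) const {
        const auto iter = overrides.find(voxelIndex);
        if (iter != overrides.end()) {
            return iter->second;
        }
        return DefaultID;
    }

    std::size_t getBufferCount() const { return transforms.size(); }

private:
    std::vector<Transform> transforms;
    std::unordered_map<int, TransformBufferID> overrides;
};
```

With this layout the number of buffers grows with the number of special voxels rather than with the total voxel count.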
Need to have a tree level populate all its child nodes with the same value if all visible or all not visible.
Doesn't crash, which is a good sign. Would like to eventually make the broadcast function iterative instead of recursive.
Needed to differentiate between the 0-3 child index and the 0-end of the nodes on a tree level. My quadtree debug image is starting to show something resembling a bird's eye view, but still glitchy.
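A rough sketch of an iterative broadcast and of the two kinds of index mentioned above. It assumes a layout that the commits don't confirm: each tree level is a flat array of 4^level nodes, and the four children of node i sit at 4*i + 0..3 on the next level. `VisibilityQuadtree`, `Visibility`, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical visibility states, not the renderer's actual enum.
enum class Visibility : std::uint8_t { Outside, Partial, Inside };

struct VisibilityQuadtree {
    // levels[L] holds 4^L nodes (assumed layout).
    std::vector<std::vector<Visibility>> levels;

    explicit VisibilityQuadtree(int levelCount) {
        std::size_t nodeCount = 1;
        for (int i = 0; i < levelCount; i++) {
            levels.emplace_back(nodeCount, Visibility::Partial);
            nodeCount *= 4;
        }
    }

    // childIndex is 0-3 (local to the parent). The result indexes the
    // whole child level, i.e. 0 to the end of that level's nodes.
    static std::size_t getChildNodeIndex(std::size_t parentNodeIndex, int childIndex) {
        assert(childIndex >= 0 && childIndex < 4);
        return (parentNodeIndex * 4) + static_cast<std::size_t>(childIndex);
    }

    // Copies an all-visible or all-not-visible result to every descendant
    // using loops instead of recursion.
    void broadcast(std::size_t level, std::size_t nodeIndex, Visibility value) {
        assert(value != Visibility::Partial);
        levels[level][nodeIndex] = value;

        std::size_t begin = nodeIndex;
        std::size_t end = nodeIndex + 1;
        for (std::size_t l = level + 1; l < levels.size(); l++) {
            // The descendants of a node form one contiguous range per level.
            begin *= 4;
            end *= 4;
            for (std::size_t i = begin; i < end; i++) {
                levels[l][i] = value;
            }
        }
    }
};
```

Under this layout, broadcasting from node 2 on level 1 of a three-level tree writes nodes 8 to 11 on level 2 and leaves the rest of that level untouched.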
To be used with quadtree node look-up.
This completely fixes the quadtree visibility calculation as far as I can tell.
Putting the voxel and entity rendering code in separate files for ease of understanding.
Don't need to when there is a fully-enclosing sky mesh.
Spent a lot of time debugging inclusive and exclusive bin pixels. Each rasterizer bin is also about 14 MB, which is huge and causes debug builds to chug whenever the resolution scale changes. I think it's bloated because of the sky being a special case high-density mesh.
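One way to keep inclusive and exclusive bounds straight is to use half-open ranges everywhere: start inclusive, end exclusive. The sketch below assumes square-ish bins of a fixed pixel dimension along one axis; `BinRange`, `getBinCount`, and `getBinRange` are hypothetical names, not the renderer's functions.

```cpp
#include <algorithm>
#include <cassert>

// Half-open bin span: startBin is inclusive, endBin is exclusive.
struct BinRange {
    int startBin;
    int endBin;
};

// Number of bins needed to cover a frame buffer dimension (rounds up).
int getBinCount(int frameBufferDim, int binDim) {
    assert(binDim > 0);
    return (frameBufferDim + binDim - 1) / binDim;
}

// Maps the pixel span [pixelStart, pixelEnd) to the bins it touches.
BinRange getBinRange(int pixelStart, int pixelEnd, int binDim, int frameBufferDim) {
    assert(binDim > 0);
    assert(pixelStart <= pixelEnd);

    // Clamp to the frame buffer so off-screen pixels select no bins.
    const int clampedStart = std::max(0, std::min(pixelStart, frameBufferDim));
    const int clampedEnd = std::max(0, std::min(pixelEnd, frameBufferDim));
    if (clampedStart == clampedEnd) {
        return { 0, 0 }; // Empty span.
    }

    const int startBin = clampedStart / binDim;
    // Round up so a span ending exactly on a bin edge excludes the next bin.
    const int endBin = (clampedEnd + binDim - 1) / binDim;
    return { startBin, endBin };
}
```

With 64-pixel bins, the span [0, 64) touches only bin 0, while [63, 65) touches bins 0 and 1.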
This should be easier to multi-thread. Not sure about the possibility that rasterizer threads will have to synchronize after every 1-8 draw calls.
Tried several things before it worked without deadlocking. Performance seems heavily bottlenecked on g_totalDepthTests accumulation.
Fixes a huge performance issue with threads fighting over it.
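The commits don't say how the contention on g_totalDepthTests was removed. One common approach, sketched here as an assumption, is to have each worker count into a local variable and combine the results once after the threads join. `countDepthTests` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Each worker accumulates privately, so no thread touches a shared
// counter inside the hot loop.
std::int64_t countDepthTests(int workerCount, int testsPerWorker) {
    std::vector<std::int64_t> perWorkerTotals(static_cast<std::size_t>(workerCount), 0);
    std::vector<std::thread> threads;

    for (int i = 0; i < workerCount; i++) {
        threads.emplace_back([&perWorkerTotals, i, testsPerWorker]() {
            std::int64_t localCount = 0;
            for (int j = 0; j < testsPerWorker; j++) {
                localCount++; // Stand-in for one depth test.
            }

            // A single write per worker instead of one per depth test.
            perWorkerTotals[static_cast<std::size_t>(i)] = localCount;
        });
    }

    for (std::thread &thread : threads) {
        thread.join();
    }

    std::int64_t total = 0;
    for (const std::int64_t count : perWorkerTotals) {
        total += count;
    }

    return total;
}
```

Joining the threads before summing means the final reads need no extra locking.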
These were being freed due to a bad alloc exception in the SoftwareRenderer before their manager had a chance to init().
Eventually want bin dimensions to vary with frame buffer resolution for better thread balancing.
It was making things too hard to understand while designing for multi-threading. Slight performance loss but will try to make it up later.
This reduces each bin from 14 MB to about 300 KB.
Black screen currently because workers aren't doing anything.
This is getting pretty complex.
Need to map each range of triangle indices to the worker's draw call index somehow.
Still need a way of iterating over each draw call and its rasterizer triangles.
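A possible shape for that mapping, sketched under the assumption that each worker keeps one flat triangle array plus a (start, count) range per draw call. `WorkerTriangleList` and `TriangleRange` are hypothetical, and `int` stands in for a rasterizer triangle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Where one draw call's triangles live in the flat array.
struct TriangleRange {
    int start;
    int count;
};

struct WorkerTriangleList {
    std::vector<int> triangles;                // Stand-in for rasterizer triangles.
    std::vector<TriangleRange> drawCallRanges; // Indexed by the worker's draw call index.

    // Appends a draw call's triangles and records their range.
    void addDrawCall(const std::vector<int> &drawCallTriangles) {
        const TriangleRange range = {
            static_cast<int>(triangles.size()),
            static_cast<int>(drawCallTriangles.size())
        };
        drawCallRanges.push_back(range);
        triangles.insert(triangles.end(), drawCallTriangles.begin(), drawCallTriangles.end());
    }

    // Finds which draw call owns a triangle index, or -1 if none does.
    int getDrawCallIndex(int triangleIndex) const {
        for (std::size_t i = 0; i < drawCallRanges.size(); i++) {
            const TriangleRange &range = drawCallRanges[i];
            if ((triangleIndex >= range.start) && (triangleIndex < (range.start + range.count))) {
                return static_cast<int>(i);
            }
        }
        return -1;
    }
};
```

Iterating each draw call and its triangles is then a nested loop: the outer loop walks `drawCallRanges`, the inner loop walks `triangles` from `start` to `start + count`. A draw call with zero triangles still gets a range, so draw call indices stay aligned.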
Rendering is working again but multi-threaded is still slower than single-threaded, and deadlocks sometimes. Frustrating.
Still getting deadlocks, need to fix the condition variables etc.
Still dealing with an occasional deadlock, though I think the problem is workers that get 0 draw calls and don't wait properly.
Caused by not properly setting all g_workers conditions.
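For reference, a small sketch of a frame hand-off where workers with 0 draw calls can't stall the main thread. It is an illustration under assumptions, not the renderer's g_workers code: every wait uses a predicate, and every worker reports completion even when it has nothing to draw. `runFrames` and its variables are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Runs frameCount frames and returns the total draw calls processed.
// drawCallCounts[i] is the number of draw calls worker i gets per frame.
int runFrames(const std::vector<int> &drawCallCounts, int frameCount) {
    const int workerCount = static_cast<int>(drawCallCounts.size());
    std::mutex mutex;
    std::condition_variable startCV; // Main thread -> workers.
    std::condition_variable doneCV;  // Workers -> main thread.
    int frameID = 0;
    int finishedCount = 0;
    bool quit = false;
    int processed = 0;

    auto workerFunc = [&](int workerIndex) {
        int lastFrameID = 0;
        while (true) {
            std::unique_lock<std::mutex> lock(mutex);
            // The predicate guards against spurious and missed wake-ups.
            startCV.wait(lock, [&]() { return quit || (frameID != lastFrameID); });
            if (quit) {
                return;
            }
            lastFrameID = frameID;
            lock.unlock();

            int localProcessed = 0;
            for (int i = 0; i < drawCallCounts[workerIndex]; i++) {
                localProcessed++; // Stand-in for rasterizing one draw call.
            }

            lock.lock();
            processed += localProcessed;
            finishedCount++; // Reported even when the worker had 0 draw calls.
            lock.unlock();
            doneCV.notify_one();
        }
    };

    std::vector<std::thread> workers;
    for (int i = 0; i < workerCount; i++) {
        workers.emplace_back(workerFunc, i);
    }

    for (int frame = 0; frame < frameCount; frame++) {
        {
            std::lock_guard<std::mutex> lock(mutex);
            finishedCount = 0;
            frameID++;
        }
        startCV.notify_all();

        std::unique_lock<std::mutex> lock(mutex);
        doneCV.wait(lock, [&]() { return finishedCount == workerCount; });
    }

    {
        std::lock_guard<std::mutex> lock(mutex);
        quit = true;
    }
    startCV.notify_all();
    for (std::thread &worker : workers) {
        worker.join();
    }

    return processed;
}
```

The frame counter lets a worker that wakes late still see that a new frame was dispatched, which a bare `wait()` without a predicate would not guarantee.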
This has several optimizations and bug fixes for the new renderer. Tl;dr:
- [...] (VoxelVisibilityChunk)
- [...] (EntityVisibilityChunk)
- [...] (SkyVisibilityManager)
- [...] (constexpr, __restrict to get compiler-generated vector instructions)

I added multi-threading to the rasterizer today but performance still scales poorly, likely due to thread synchronization for draw calls, and each thread's workload not being big enough. Ideally FPS will be at a playable level for all PCs before this is merged. Just creating now for visibility.
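As an illustration of the constexpr and __restrict item, here is a small loop written so a compiler is free to vectorize it. `__restrict` is a compiler extension accepted by MSVC, GCC, and Clang rather than standard C++. The function, the lane count, and the choice of a depth interpolation are hypothetical, and whether vector instructions are actually emitted depends on the compiler and optimization flags.

```cpp
#include <cassert>

// Compile-time trip count so the loop can be unrolled (hypothetical value).
constexpr int LaneCount = 8;

// __restrict promises the compiler that these pointers never alias, so it
// doesn't have to reload inputs after each store to out.
void lerpDepths(const float *__restrict start, const float *__restrict end,
    const float *__restrict percents, float *__restrict out) {
    for (int i = 0; i < LaneCount; i++) {
        out[i] = start[i] + ((end[i] - start[i]) * percents[i]);
    }
}
```

The caller must honor the no-alias promise: passing the same buffer as both an input and `out` is undefined behavior.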