forked from DTolm/VkFFT
HIP: add option to embed static blockDim
Using blockDim in HIP kernels unfortunately incurs a large overhead, because this (dynamic) information is stored in the dispatch packet, which is located in a host-coherent memory region. Since VkFFT always knows the work group size it is going to use, just replace uses of blockDim with these values. This means the load from non-cached memory is avoided, the dispatch pointer does not have to be loaded, which frees up 2 SGPRs, and some indexing calculations might constant-fold better.

The added option `useStaticWorkGroupSize` has three possible values:
- -1: Disable embedding blockDim sizes, effectively the old behavior
- 0: Automatically enable embedding when profitable (always except for RDNA2)
- 1: Always enable

RDNA is disabled by default because this can actually decrease performance sometimes, for reasons not fully known; details at [1].

[1]: ROCm/hipamd#53

Co-authored-by: [email protected]
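The option semantics described above can be sketched as a small decision helper plus a function that picks the text emitted into generated kernel source. This is an illustration only, not the code from this commit: `shouldEmbedBlockDim`, `blockDimExpr`, and the `isRdna2` flag are hypothetical names, and how the target architecture is detected is assumed to happen elsewhere.

```cpp
#include <string>

// Hypothetical helper: decide whether to embed the work group size,
// following the three option values listed in the commit message.
// `isRdna2` is an assumed input; architecture detection is not shown.
bool shouldEmbedBlockDim(int useStaticWorkGroupSize, bool isRdna2) {
    if (useStaticWorkGroupSize < 0) return false;  // -1: old behavior, keep blockDim
    if (useStaticWorkGroupSize > 0) return true;   //  1: always embed
    return !isRdna2;                               //  0: automatic, skip RDNA2
}

// Hypothetical helper: text to emit into generated kernel source for one
// blockDim component. When embedding, the known size becomes a literal,
// so the kernel never reads blockDim from the dispatch packet.
std::string blockDimExpr(bool embed, char axis, unsigned size) {
    if (embed) return std::to_string(size);
    return std::string("blockDim.") + axis;
}
```

With embedding enabled and a work group size of 64 along x, a generated index expression would contain the literal `64` where it previously contained `blockDim.x`, which is what gives the compiler a chance to constant-fold the surrounding arithmetic.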
Gergely Meszaros committed Feb 14, 2023
1 parent 41a4808, commit d9fb2cf
Showing 1 changed file with 40 additions and 31 deletions.