Commit
Better hmin/hmax algorithms for SSE/AVX2
Use a formulation that automatically produces the same result in all lanes, avoiding a separate broadcast step.

The same approach would work with floats in principle, but it's not guaranteed to give the same result in all lanes when NaNs are involved (due to the way MINPS/MAXPS are defined), so leave the float versions alone for now.

About 1% encode time reduction when encoding an 8192x8192 test texture at 6x6 -thorough on a Ryzen 7950X3D.
Fabian Giesen committed Nov 1, 2024
1 parent 2ff200e, commit b79248a
Showing 2 changed files with 16 additions and 18 deletions.