Hardware accelerated floating point functions #39

Kuratius · 2024-04-09T15:20:12Z

Kuratius@ace0e20

I tried implementing hardware-accelerated fdiv and sqrtf functions. I also tried writing an fmul implementation that should be somewhat faster than the default.
There's also a rounding implementation that avoids float adds:
Kuratius@515fc97
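
As a rough illustration of the idea (this is not the code from that commit), a float can be rounded to the nearest integer using only integer operations on its bit pattern, so no soft-float add such as `x + 0.5f` is needed. The name `round_to_int_sketch` is hypothetical. The sketch rounds halves away from zero, assumes IEEE 754 single precision, and, like the functions in this issue, does not handle NaNs, infinities, or values whose magnitude does not fit in 31 bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: round a float to the nearest integer without any
   floating point arithmetic. Assumes |x| < 2^31 and no NaN/inf input. */
int32_t round_to_int_sketch(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);              /* reinterpret the float's bits */

    int32_t exp = (int32_t)((bits >> 23) & 0xFF) - 127;  /* unbiased exponent */
    uint32_t mant = (bits & 0x7FFFFFu) | 0x800000u;      /* 24-bit mantissa with implicit 1 */
    uint32_t mag;

    if (exp < -1) {
        mag = 0;                                 /* |x| < 0.5 (also zero and denormals) */
    } else if (exp >= 23) {
        mag = mant << (exp - 23);                /* already an integer, just scale up */
    } else {
        uint32_t shift = (uint32_t)(23 - exp);   /* number of fractional bits, 1..24 */
        mag = (mant + (1u << (shift - 1))) >> shift;  /* add one half as an integer, then truncate */
    }

    return (bits >> 31) ? -(int32_t)mag : (int32_t)mag;  /* reapply the sign */
}
```

For example, under these assumptions `round_to_int_sketch(2.5f)` gives 3 and `round_to_int_sketch(-2.5f)` gives -3.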

Note that fdiv and sqrtf start the hardware divider or square root unit and then do the rest of the float handling while the hardware is busy, instead of blocking. That may mean that handling of NaNs and other invalid values could be added without changing the performance of these functions.
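
To make the overlap concrete, here is a minimal host-side sketch of the structure, not the code from the commit. `div_start` and `div_result` are hypothetical stand-ins for writing to and reading from the hardware divider (here they just use a 64-bit integer division), and `fdiv_sketch` is a made-up name. It assumes IEEE 754 single precision, truncates instead of rounding, and ignores zeros, denormals, NaNs, infinities, overflow, and underflow.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the hardware divider. On real hardware,
   div_start would write the operands to the divider registers and return
   immediately, and div_result would read the quotient once it is ready. */
static uint64_t pending_num;
static uint32_t pending_den;
static void div_start(uint64_t num, uint32_t den) { pending_num = num; pending_den = den; }
static uint32_t div_result(void) { return (uint32_t)(pending_num / pending_den); }

float fdiv_sketch(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);

    /* 24-bit mantissas with the implicit leading 1 restored */
    uint32_t ma = (ua & 0x7FFFFFu) | 0x800000u;
    uint32_t mb = (ub & 0x7FFFFFu) | 0x800000u;

    /* Kick off the mantissa division first ... */
    div_start((uint64_t)ma << 24, mb);

    /* ... and do the sign and exponent work while the divider would be busy.
       Checks for NaN/inf/zero could be placed here as well. */
    uint32_t sign = (ua ^ ub) & 0x80000000u;
    int32_t exp = (int32_t)((ua >> 23) & 0xFF) - (int32_t)((ub >> 23) & 0xFF) + 127;

    /* Collect the quotient, which lies in (2^23, 2^25) */
    uint32_t q = div_result();
    if (q >= (1u << 24)) {
        q >>= 1;          /* ma >= mb: drop one bit to get back to 24 bits */
    } else {
        exp -= 1;         /* ma < mb: quotient is already 24 bits, exponent is one lower */
    }

    uint32_t ur = sign | ((uint32_t)exp << 23) | (q & 0x7FFFFFu);
    float r;
    memcpy(&r, &ur, sizeof r);
    return r;
}
```

With exactly representable quotients this matches ordinary division, e.g. `fdiv_sketch(6.0f, 2.0f)` gives `3.0f`. Inexact quotients can differ from a correctly rounded result in the last bit because the sketch truncates.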

Also note that they do not properly handle invalid or special values such as NaNs, overflows, underflows, and infinities.
I think most of the compiler option reshuffling isn't necessary; only the --use-blx, --wrap, and -u flags are actually necessary to get it to use these functions instead of GCC's default soft-float implementation.
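
For readers unfamiliar with `--wrap`: GNU ld resolves undefined references to `symbol` to `__wrap_symbol`, and references to `__real_symbol` to the original `symbol`. The sketch below shows the shape of such a wrapper. It assumes that `__aeabi_fdiv` (the ARM EABI soft-float single-precision division helper) is one of the wrapped symbols, which this issue does not state explicitly, and `hw_fdiv_placeholder` is a hypothetical stand-in for the hardware-accelerated routine.

```c
/* Assumed link flags for this sketch (symbol name is an assumption):
 *   -Wl,--wrap=__aeabi_fdiv     redirect calls to __aeabi_fdiv to the wrapper
 *   -Wl,-u,__wrap___aeabi_fdiv  force the wrapper to be pulled in, e.g. from a static library
 */

/* With --wrap, this name would resolve to libgcc's original implementation.
   It is only declared here and never called. */
float __real___aeabi_fdiv(float a, float b);

/* Hypothetical placeholder for the hardware-accelerated division. It goes
   through double so that, on the target, it does not itself call the
   wrapped single-precision helper. */
static float hw_fdiv_placeholder(float a, float b)
{
    return (float)((double)a / (double)b);
}

/* Every `a / b` on floats that the compiler lowers to a call to
   __aeabi_fdiv would end up here once the symbol is wrapped. */
float __wrap___aeabi_fdiv(float a, float b)
{
    return hw_fdiv_placeholder(a, b);
}
```

On a host machine without these flags the wrapper is just an ordinary function, so it can be called directly to check its behavior.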

I noticed some minor graphical issues (roughly one frame every few seconds looks wrong) when using these, but it's hard to tell what causes them: whether NaN handling is required or something else is going wrong. These probably shouldn't be added to the project by default until that has been tested further.
This also should be benchmarked in more detail.

I'm opening this issue so that the discussion about this isn't hidden away in the Discord.


Kuratius · 2024-04-09T22:07:01Z

if ((exponent<=(1<<23))){
Strictly speaking, this line in div.c should probably use < instead of <=, or just be <=0.

Kuratius · 2024-05-12T13:42:16Z

blocksds/libnds@747890e
The hw sqrtf has been merged into blocksds and it now also has NaN support.

Kuratius mentioned this issue Apr 29, 2024

Add hardware accelerated floating point square root #40

Closed
