Hardware accelerated floating point functions #39

Open
Kuratius opened this issue Apr 9, 2024 · 2 comments


Kuratius commented Apr 9, 2024

Kuratius@ace0e20

I tried implementing hardware-accelerated fdiv and sqrtf functions. I also tried writing an fmul implementation that should be somewhat faster than the default.
There's also a rounding implementation that avoids float adds:
Kuratius@515fc97
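
To illustrate what rounding without float adds can look like: the linked commit is not reproduced here, so the following is only a minimal sketch of one common approach, not necessarily the author's. Instead of computing something like `floorf(x + 0.5f)`, which costs a soft-float add, it works on the IEEE 754 bit pattern with integer operations. The function name `sketch_round_to_int` is hypothetical, and the tie rule (halves round away from zero) is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Round a float to the nearest int32_t using only integer operations.
 * Ties round away from zero. Values outside the int32_t range, NaNs and
 * infs saturate (an assumption made for this sketch). */
int32_t sketch_round_to_int(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);                       /* raw IEEE 754 bits */

    int32_t e = (int32_t)((u >> 23) & 0xFF) - 127;  /* unbiased exponent */
    uint32_t m = (u & 0x007FFFFFu) | 0x00800000u;   /* 24-bit significand */
    int negative = (u >> 31) != 0;
    uint32_t r;

    if (e < -1) {
        r = 0;                                      /* |x| < 0.5 */
    } else if (e > 30) {
        return negative ? INT32_MIN : INT32_MAX;    /* out of range */
    } else if (e >= 23) {
        r = m << (e - 23);                          /* already an integer */
    } else {
        int shift = 23 - e;                         /* fraction bits to drop */
        r = (m + (1u << (shift - 1))) >> shift;     /* add one half, truncate */
    }
    return negative ? -(int32_t)r : (int32_t)r;
}
```

The "add one half" step here is an integer add on the significand, which is cheap compared with a software floating point addition.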

Note that fdiv and sqrtf start the hardware divider or square root unit and then do the rest of the float handling while the hardware is busy, instead of blocking until the result is ready. That may mean that handling of NaNs and other invalid values could be added without changing the performance of these functions.
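
The "start the hardware, then do other work" pattern can be sketched as follows. This is not the code from the commit: `hw_div_start` and `hw_div_result` are hypothetical stand-ins for the hardware divider so that the sketch runs on any host, and `sketch_fdiv` is a made-up name. Like the functions described here, it assumes normal, finite, nonzero inputs and an in-range result.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the hardware divider. On real hardware the
 * "start" call would write the operands to the divider registers and
 * return at once; here the division happens when the result is read. */
static uint64_t pending_numer;
static uint32_t pending_denom;

static void hw_div_start(uint64_t numer, uint32_t denom)
{
    pending_numer = numer;
    pending_denom = denom;
}

static uint64_t hw_div_result(void)
{
    return pending_numer / pending_denom;
}

/* Divide two normal, finite, nonzero floats. No handling of NaN, inf,
 * zero, denormals, overflow or underflow. */
float sketch_fdiv(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);

    /* 24-bit significands with the implicit leading 1 restored. */
    uint32_t ma = (ua & 0x007FFFFFu) | 0x00800000u;
    uint32_t mb = (ub & 0x007FFFFFu) | 0x00800000u;

    /* Start the divide first... */
    hw_div_start((uint64_t)ma << 25, mb);

    /* ...then do the sign and exponent work while it would be running. */
    uint32_t sign = (ua ^ ub) & 0x80000000u;
    int32_t exponent = (int32_t)((ua >> 23) & 0xFF)
                     - (int32_t)((ub >> 23) & 0xFF) + 127;

    /* Only now wait for the quotient, which lies in [2^24, 2^26). */
    uint32_t q = (uint32_t)hw_div_result();
    uint32_t mant;
    if (q >= (1u << 25)) {
        mant = (q + 2) >> 2;        /* two extra bits: round to nearest */
    } else {
        mant = (q + 1) >> 1;        /* one extra bit: round to nearest */
        exponent -= 1;              /* quotient of significands was < 1 */
    }

    uint32_t bits = sign | ((uint32_t)exponent << 23) | (mant & 0x007FFFFFu);
    float result;
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

Checks for NaNs or infs would sit in the same gap between starting the divide and reading the result, which is why they might come at little or no extra cost.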

Also note that they do not yet handle special cases such as NaNs, infs, overflows, and underflows properly.
I think most of the compiler option reshuffling isn't necessary; only the --use-blx, --wrap, and -u flags are actually needed to get it to use these functions instead of GCC's default soft-float implementation.
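
For readers unfamiliar with `--wrap`: GNU ld's `--wrap=symbol` option resolves undefined references to `symbol` to `__wrap_symbol`, and references to `__real_symbol` to the original. The sketch below assumes the wrapped symbol is `__aeabi_fdiv`, the ARM EABI helper that GCC calls for single-precision division on soft-float targets; the issue does not list the exact symbols, so treat that and the name `fast_fdiv` as assumptions. It would be linked with something like `-Wl,--wrap=__aeabi_fdiv`.

```c
/* Stand-in for the custom division routine; the name fast_fdiv is
 * hypothetical. It divides in double precision so that, on a soft-float
 * target, it does not call back into the wrapped single-precision helper. */
float fast_fdiv(float a, float b)
{
    return (float)((double)a / (double)b);
}

/* With -Wl,--wrap=__aeabi_fdiv, calls that the compiler emits to
 * __aeabi_fdiv are resolved to this function instead. The original
 * remains reachable as __real___aeabi_fdiv if a fallback is wanted. */
float __wrap___aeabi_fdiv(float a, float b)
{
    return fast_fdiv(a, b);
}
```

The `-u symbol` option tells the linker to treat a symbol as undefined, which forces the object that defines it to be pulled in from a library even if nothing else references it directly.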

I noticed some minor graphical issues (about one frame every few seconds looks wrong) when using these, but it's hard to tell what causes them: whether NaN handling is required or something else is going wrong. Probably don't add these to the project by default until that has been tested further.
This also should be benchmarked in more detail.

I'm opening this issue so that the discussion about this isn't hidden away in the discord.


Kuratius commented Apr 9, 2024

if ((exponent<=(1<<23))){
Strictly speaking, this line in div.c should probably use < instead of <=, or just compare with <= 0.
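
A small sketch of why the comparison matters, under an assumption: div.c is not shown here, so this supposes that `exponent` is a signed integer holding the biased result exponent already shifted into bit 23 and up, with the low 23 bits clear. In that layout `1<<23` is the exponent of the smallest normal float, which is a valid result and should not be treated as underflow. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXP_ONE (1 << 23)   /* biased exponent 1, shifted into place */

/* The check as written in the issue: also catches the smallest normal. */
bool underflow_le(int32_t exponent)
{
    return exponent <= EXP_ONE;
}

/* The suggested fix: only exponents below the smallest normal. */
bool underflow_lt(int32_t exponent)
{
    return exponent < EXP_ONE;
}

/* The alternative: equivalent to underflow_lt when the low 23 bits of
 * exponent are zero, as assumed here. */
bool underflow_le0(int32_t exponent)
{
    return exponent <= 0;
}
```

With these definitions the three checks agree for every exponent except `1<<23` itself, where only the `<=` version reports an underflow.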

@Kuratius

blocksds/libnds@747890e
The hw sqrtf has been merged into blocksds and it now also has NaN support.
