I tried implementing hardware-accelerated fdiv and sqrtf functions. I also tried writing an fmul implementation that should be somewhat faster than the default.
There's also a rounding implementation that avoids float adds: Kuratius@515fc97
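For readers who have not seen the idea before, here is a minimal sketch of rounding a float to an integer using only integer operations on the bit pattern. It is not the code from the commit above. The name `sketch_round_to_int`, the round-half-away-from-zero behaviour, and the saturation on out-of-range inputs are all assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rounds to nearest, halves away from zero, using
 * only integer operations (no float add of 0.5). Assumes IEEE 754
 * single precision. NaN and out-of-range inputs saturate, which is an
 * illustrative choice and not taken from the original commit. */
int32_t sketch_round_to_int(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);               /* reinterpret bits safely */

    int negative = (int)(u >> 31);
    int32_t e = (int32_t)((u >> 23) & 0xFF) - 127;   /* unbiased exponent */
    uint32_t m = (u & 0x007FFFFFu) | 0x00800000u;    /* 24-bit mantissa   */
    int32_t r;

    if (e < -1) {
        r = 0;                               /* |x| < 0.5 (also zero, denormals) */
    } else if (e >= 31) {
        return negative ? INT32_MIN : INT32_MAX;     /* too large, inf, NaN */
    } else if (e >= 23) {
        r = (int32_t)(m << (e - 23));        /* already an integer */
    } else {
        int shift = 23 - e;                  /* 1..24 fractional bits */
        /* Add one half in the integer domain, then drop the fraction. */
        r = (int32_t)((m + (1u << (shift - 1))) >> shift);
    }
    return negative ? -r : r;
}
```

With these assumptions, `sketch_round_to_int(2.5f)` gives 3 and `sketch_round_to_int(-2.5f)` gives -3; a real implementation may want a different tie rule.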
Note that fdiv and sqrtf start the hardware divider or square root unit and then do the rest of the float handling while it runs, instead of blocking on the result; that may mean that handling of NaNs and other invalid values could be added without changing the performance of these functions.
Also note that they currently do not handle special cases such as NaNs, overflows, underflows, and infs properly.
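To make the structure concrete, here is a portable sketch of a float divide built on an integer divide. It is not the code from the commits: the name `sketch_fdiv` is hypothetical, and a plain 64-bit `/` stands in for the hardware divider so the example runs anywhere. Like the functions described above, it ignores NaNs, infs, denormals, division by zero, overflow, and underflow.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's bits as an integer and back. */
static uint32_t f2u(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float u2f(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Hypothetical sketch. Only valid for normal, finite inputs whose
 * quotient is also a normal float. */
float sketch_fdiv(float a, float b)
{
    uint32_t ua = f2u(a), ub = f2u(b);
    uint32_t sign = (ua ^ ub) & 0x80000000u;
    int32_t ea = (int32_t)((ua >> 23) & 0xFF);
    int32_t eb = (int32_t)((ub >> 23) & 0xFF);

    if (ea == 0)
        return u2f(sign);          /* zero numerator (denormals flushed) */

    /* 24-bit mantissas with the implicit leading one restored. */
    uint32_t ma = (ua & 0x007FFFFFu) | 0x00800000u;
    uint32_t mb = (ub & 0x007FFFFFu) | 0x00800000u;

    /* Step 1: on real hardware the operands would be handed to the
     * divider here. In this sketch we only prepare the numerator. */
    uint64_t num = (uint64_t)ma << 25;

    /* Step 2: work that could overlap with the running divider.
     * Special-case checks would also fit in this slot. */
    int32_t e = ea - eb + 127;

    /* Step 3: collect the quotient (stand-in for reading the result). */
    uint32_t q = (uint32_t)(num / mb);

    /* Normalise to 24 mantissa bits plus one guard bit. */
    if (q >= (1u << 25))
        q >>= 1;                   /* mantissa ratio was in [1, 2) */
    else
        e -= 1;                    /* mantissa ratio was in (0.5, 1) */

    uint32_t m = (q + 1u) >> 1;    /* round to nearest using the guard bit */

    /* The implicit one in m adds 1 to the exponent field, hence e - 1. */
    return u2f(sign + ((uint32_t)(e - 1) << 23) + m);
}
```

For example, `sketch_fdiv(6.0f, 3.0f)` returns `2.0f`. The point of the layout is that step 2 is independent of the quotient, which is why extra input checks might fit there at little or no cost on hardware where the divide runs in the background.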
I think most of the compiler option reshuffling isn't necessary; only the --use-blx, --wrap, and -u flags are actually needed to get it to use these functions instead of GCC's default soft float implementation.
I noticed some minor graphical issues (roughly one frame every few seconds looks wrong) when using these, but it's hard to tell what causes them: whether NaN handling is required or something else is going wrong. Probably don't add these to the project by default unless that gets tested further.
This also should be benchmarked in more detail.
I'm opening this issue so that the discussion about this isn't hidden away in the Discord.
Kuratius@ace0e20