-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Quaddtype #98
Adding Quaddtype #98
Conversation
Heh. Over at openblas-lib we were just discussing what would happen if someone started using the libquadmath functions linked from gfortran in MacPython/openblas-libs#85. Will this eventually percolate into linalg calls? It would require a new build of OpenBLAS, since by default support is disabled with an admonition that the feature is not well implemented. How will this interact with numpy ufuncs, are you planning to add required loops? |
tlfloat is a library positioned completely differently from sleef. |
@mattip if we upstream this to NumPy (still lots of work before we can do that) we'd probably need to use SLEEF exclusively, which has the boost (BSD-like) license. The plan is to add ufuncs and casts in followups. If you'd like we should chat about this project sometime so you can meet Swayam. |
@shibatch thanks for your interest here! We'd love your opinion bout this work. The goal here is to maybe eventually replace numpy's longdouble dtype with something that doesn't consistently produce annoying platform-specific bugs that maintainers who do not care about longdouble need to fix. Also providing true 128 bit floats. NumPy has long had a float128 DType that is not really a float128. tlfloat seems like a good option for the MPFR DType @seberg wrote, if anyone wants to push that farther forward. |
I still don't understand, but are you guys interested in correctly-rounded results? One of the reasons I started developing tlfloat was the goal of creating a library that can produce proper correctly-rounded results. |
I guess there is interest to have something similar to Intels 80-bit longdouble support, that is higher precision compared to double but without to much perfomance impact compared to double. This is the reason for the existence of libraries like D.Bailey's double-double library or the now available libxprec library. There is also interest in full quad-precision. libquadmath supports this, but with a non-compatible license (LGPL) for scipy. It would be good to know wether tfloat's quads is more performant compared to sleef's quad. Personally I have no opinion about octupole precision. |
It's not correctly rounded results that we care about so much as a migration path for users of longdouble. Also a way to get higher precision floats from a cross-platform library. Agreed about the speed of SLEEF. Is there something faster we can use? That said, speed is less of a concern IMO than providing the functionality at all. Right now in NumPy if you're not on Linux you're out of luck if you need anything higher precision than 64 bit floats. We also talked about writing the code in such a way that you could select the float backend at runtime with a parameter for the dtype constructor, since that would help with benchmarking and experimentation. |
Oh, I misinterpreted your earlier message, tlfloat does implement quads. |
@ngoldbaum Thank you for your interest in sleef and tlfloat. I have not yet benchmarked the computation speed of tlfloat, but I believe it is similar to libquadmath in terms of the basic four arithmetic operations, since the internal processing is equivalent to libquadmath. For elementary functions, sleef should be significantly faster. Sleef internally performs operations in triple-double format, and the reason it is slower for the basic four arithmetic operations is due to the slow conversion between triple-double and IEEE format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a bunch of comments inline but it's awesome to see how much progress you've already made!
quaddtype/README.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's update the contents here rather than just delete the README. No need to go into detail, just a sentence or two explaining what it is. It might also help to explain you need to install SLEEF and how to build against SLEEF (e.g. with environment variables or by installing it system-wide).
inline Sleef_quad | ||
to_quad<long double>(long double x) | ||
{ | ||
return Sleef_cast_from_doubleq1(x); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth looking in the SLEEF docs or implementation at whether x
needs to be memory-aligned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure here, but as handling earlier we can use memcpy
to copy the long double
value to an aligned buffer before passing it to the Sleef_cast_from_doubleq1
that way it can handle the non-alignment, something like this
alignas(long double) long double aligned_x;
memcpy(&aligned_x, &x, sizeof(long double));
return Sleef_cast_from_doubleq1(aligned_x);
|
||
loop_descrs[0] = PyArray_GetDefaultDescr(dtypes[0]); | ||
*view_offset = 0; | ||
return NPY_SAFE_CASTING; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it always safe? I don't actually know if, for example, all 64 bit ints are safely convertible to float128 without rounding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that make sense looking on the sizes as,
A 64-bit integer has a maximum value of float128
format has a 113-bit significand (mantissa) so technically it should represent without rounding
Also like currently dealing with dtypes that all are smaller than float128 but the actual individual implementation might be different and may require rounding
npy_intp *view_offset) | ||
{ | ||
if (given_descrs[1] == NULL) { | ||
loop_descrs[1] = (PyArray_Descr *)new_quaddtype_instance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
your DType isn't parametric so it's perfectly valid to create a singleton
instance and then return the singleton instead of creating a new instance whenever you need one.
|
||
// comparison functions | ||
|
||
template <cmp_def comp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to above, unnecessary template
return 0; | ||
} | ||
|
||
template <cmp_def comp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too
|
||
assert np.abs(np.float64(quad_result) - float_result) < 1e-10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is testing much, let's chat more about how to test these ufuncs at our next meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We didn't get a chance to talk about this last night. The idea I had was to use sympy and/or mpfloat to calculate these results with arbitrary precision, then make sure the result you get is within the expected error bounds (1-4 ULP depending on the operation IIRC in the SLEEF docs).
OK, tests are passing and I think all my comments have been addressed, let's bring this in. Let's try to keep followups more bite-sized to make reviewing these easier. |
This PR adds the initial support of SLEEF backed QuadDtype