Consider adopting fast_double_parser #182

Open
lemire opened this issue Oct 26, 2020 · 4 comments

lemire commented Oct 26, 2020

I have not investigated the float parsing in daw_json_link, but I wanted to make sure you were aware that we have packaged the fast number parsing routine from the simdjson library into its own single-header library: https://github.com/lemire/fast_double_parser
This library provides exact parsing at high speed (under Linux, FreeBSD, macOS, Visual Studio).

Feel free to close this issue if it is not relevant.

Owner

beached commented Oct 26, 2020

Right now I have put off doing SIMD for the real-number parsing, and I get good performance by parsing all the available significant digits of the result type, parsing the exponent, and then building the value from there as s * 10^e. This has the beauty of working at compile time. Your other project, https://github.com/lemire/fast_float, which was referenced for its from_chars interface, probably fits better, as I have no requirement of a trailing zero on the buffer being parsed.
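The approach described above (collect the significant digits, read the exponent, combine as s * 10^e) can be sketched roughly as follows. This is only an illustration and not daw_json_link's actual code: `naive_parse_double`, `scale_by_pow10`, and `is_digit` are hypothetical names, the 19-digit cap is an assumption chosen so that the significand fits in 64 bits, and the input is assumed to be a well-formed JSON-style number. The result is not guaranteed to be the nearest double.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Hypothetical helper: scale s by 10^e using repeated multiplication.
// Simple and constexpr-friendly, but each step may introduce rounding error.
constexpr double scale_by_pow10(double s, int e) {
    if (s == 0.0) return 0.0;
    double p = 1.0;
    int const n = e < 0 ? -e : e;
    for (int i = 0; i < n; ++i) p *= 10.0;
    return e < 0 ? s / p : s * p;
}

// Hypothetical sketch: assumes well-formed input, does no error reporting.
constexpr double naive_parse_double(std::string_view sv) {
    std::size_t i = 0;
    bool negative = false;
    if (i < sv.size() && sv[i] == '-') { negative = true; ++i; }

    std::uint64_t significand = 0;
    int digit_count = 0;  // significant digits stored so far (assumed cap: 19)
    int exp10 = 0;        // decimal exponent adjustment

    // Integer part: digits beyond the cap only shift the exponent.
    for (; i < sv.size() && is_digit(sv[i]); ++i) {
        if (digit_count < 19) {
            significand = significand * 10 + static_cast<std::uint64_t>(sv[i] - '0');
            if (significand != 0) ++digit_count;
        } else {
            ++exp10;
        }
    }
    // Fraction part: each stored digit lowers the exponent by one;
    // digits beyond the cap are dropped.
    if (i < sv.size() && sv[i] == '.') {
        ++i;
        for (; i < sv.size() && is_digit(sv[i]); ++i) {
            if (digit_count < 19) {
                significand = significand * 10 + static_cast<std::uint64_t>(sv[i] - '0');
                if (significand != 0) ++digit_count;
                --exp10;
            }
        }
    }
    // Optional exponent part.
    if (i < sv.size() && (sv[i] == 'e' || sv[i] == 'E')) {
        ++i;
        bool exp_negative = false;
        if (i < sv.size() && (sv[i] == '+' || sv[i] == '-')) {
            exp_negative = sv[i] == '-';
            ++i;
        }
        int e = 0;
        for (; i < sv.size() && is_digit(sv[i]); ++i) {
            if (e < 10000) e = e * 10 + (sv[i] - '0');  // clamp absurd exponents
        }
        exp10 += exp_negative ? -e : e;
    }
    double const result = scale_by_pow10(static_cast<double>(significand), exp10);
    return negative ? -result : result;
}

// Evaluated by the compiler: these inputs happen to be exactly representable.
static_assert(naive_parse_double("1.5") == 1.5);
static_assert(naive_parse_double("0.125") == 0.125);

// The same function also works at run time.
inline void check_runtime_path() {
    assert(naive_parse_double("1e3") == 1000.0);
}
```

Because the scaling step rounds more than once, a sketch like this can land an ulp or two away from the correctly rounded value, which is the gap that exact parsers such as fast_double_parser and fast_float are meant to close.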

I can try it out, but I don't think I would be able to use it under the current license: my project is licensed under BSL1.0 and these libraries are under Apache 2.0.

Author

lemire commented Oct 26, 2020

  1. The goal of these libraries is to provide exact parsing (to the nearest float). You can, of course, parse numbers faster if you don't aim for the nearest float... but the difference is surprisingly small...

  2. Yes, fast_float and fast_double_parser are similar.

  3. I don't think that there is any SIMD involved. I would love to parse numbers faster with SIMD instructions, but, honestly, I have not found a way that is worth it yet.

  4. If you are only blocked by the licence, I will gladly fix that for you. I use Apache by default, but it is meant to be super liberal. I do not wish to block anyone.

(Note that it is entirely up to you. I just wanted you to be aware of the option.)

Owner

beached commented Oct 26, 2020

I would like to look at using compute_float_64 for the non-constexpr code path when the result type is double or float. Right now I am getting about 900-1800 MB/s, depending on the context of the parsing, with a difference from strtod of 0 ulp about 2/3 of the time, 1 ulp about 1/3 of the time, and rarely 2 ulp. These were tested on many runs of 1 million random floating-point numbers.
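For readers who want to reproduce this kind of ulp comparison, one common way to measure the distance between two doubles is to compare their bit patterns on an ordered integer scale. The sketch below is an assumption about how such a measurement could be done, not the test harness used in this thread: `ordered_bits`, `ulp_distance`, and `ulp_error_vs_strtod` are hypothetical names, IEEE 754 binary64 doubles are assumed, and NaN inputs are not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <limits>

// Map a double onto an integer line where adjacent representable values
// differ by exactly 1. Assumes IEEE 754 binary64; NaN is not handled.
inline std::int64_t ordered_bits(double d) {
    static_assert(sizeof(double) == sizeof(std::int64_t), "binary64 assumed");
    std::int64_t bits = 0;
    std::memcpy(&bits, &d, sizeof bits);
    // Doubles are sign-magnitude: remap negatives so the ordering is monotone
    // and -0.0 lands on the same point as +0.0.
    return bits < 0 ? std::numeric_limits<std::int64_t>::min() - bits : bits;
}

// Number of representable doubles between a and b (0 means identical value).
inline std::uint64_t ulp_distance(double a, double b) {
    std::int64_t const x = ordered_bits(a);
    std::int64_t const y = ordered_bits(b);
    // Subtract in unsigned arithmetic to avoid signed overflow.
    return x >= y ? static_cast<std::uint64_t>(x) - static_cast<std::uint64_t>(y)
                  : static_cast<std::uint64_t>(y) - static_cast<std::uint64_t>(x);
}

// Compare a candidate parse result against strtod for the same text.
// text must be null-terminated; strtod's result depends on the C locale.
inline std::uint64_t ulp_error_vs_strtod(char const* text, double candidate) {
    return ulp_distance(std::strtod(text, nullptr), candidate);
}

inline void ulp_example() {
    // 0.1 + 0.2 is the classic neighbour of 0.3: adjacent doubles, 1 ulp apart.
    assert(ulp_distance(0.1 + 0.2, 0.3) == 1);
    assert(ulp_distance(0.0, -0.0) == 0);
}
```

Running `ulp_error_vs_strtod` over a large set of random inputs and tallying the results would give a 0/1/2 ulp breakdown like the one quoted above.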

Due to the nature of the JSON parser, if I have to skip the number but need to parse it later, I use a different parser with information saved from the skip. Either way, the exact-parsing property of your compute_float_64 function is super useful, and with a license compatible with BSL1.0 it could probably help.

Author

lemire commented Oct 26, 2020

@beached Added a secondary license just now (BSL1.0).

beached added a commit that referenced this issue May 11, 2021
path that sent to strtod as that requires the input source.