From 9e0a685ade5078b43734495dae3cb3207d69d4e9 Mon Sep 17 00:00:00 2001 From: Werner Randelshofer Date: Sat, 4 Mar 2023 11:02:09 +0100 Subject: [PATCH] Remove redundant code. Improve documentation inside the code. --- README.md | 75 ++++++++++--------- .../JavaBigDecimalFromByteArray.java | 50 ++++++++++--- .../JavaBigDecimalFromCharArray.java | 49 ++++++++++-- .../JavaBigDecimalFromCharSequence.java | 48 +++++++++--- .../JavaBigIntegerFromByteArray.java | 4 +- .../JavaBigIntegerFromCharArray.java | 3 +- .../JavaBigIntegerFromCharSequence.java | 3 +- .../ParseDigitsTaskByteArray.java | 17 ++--- .../ParseDigitsTaskCharArray.java | 19 ++--- .../ParseDigitsTaskCharSequence.java | 17 ++--- ...avaBigIntegerFromByteArrayScalability.java | 47 ++++++------ 11 files changed, 204 insertions(+), 128 deletions(-) diff --git a/README.md b/README.md index 635ad5a6..fa0155d1 100644 --- a/README.md +++ b/README.md @@ -7,9 +7,9 @@ This is a Java port of Daniel Lemire's [fast_float](https://github.com/fastfloat This project provides parsers for `double`, `float`, `BigDecimal` and `BigInteger` values. The parsers are optimised for speed for the most common inputs. -The code in this project contains optimised versions for Java SE 1.8, 11, 17, 19 and 20-ea. +The code in this project contains optimised versions for Java SE 1.8, 11, 17, 19 and 20. The code is released in a single multi-release jar, which contains the code for all these versions -except 20-ea. +except 20. ## License @@ -33,19 +33,19 @@ import ch.randelshofer.fastdoubleparser.JavaBigIntegerParser; import ch.randelshofer.fastdoubleparser.JsonDoubleParser; class MyMain { - public static void main(String... 
args) { - double d = JavaDoubleParser.parseDouble("1.2345e135"); - float f = JavaFloatParser.parseFloat("1.2345f"); - BigDecimal bd = JavaBigDecimalParser.parseBigDecimal("1.2345"); - BigInteger bi = JavaBigIntegerParser.parseBigInteger("12345"); - double jsonD = JsonDoubleParser.parseDouble("1.2345e85"); - } + public static void main(String... args) { + double d = JavaDoubleParser.parseDouble("1.2345e135"); + float f = JavaFloatParser.parseFloat("1.2345f"); + BigDecimal bd = JavaBigDecimalParser.parseBigDecimal("1.2345"); + BigInteger bi = JavaBigIntegerParser.parseBigInteger("12345"); + double jsonD = JsonDoubleParser.parseDouble("1.2345e85"); + } } ``` -The `parse...()`-methods take a `CharacterSequence`. a `char`-array or a `byte`-array as argument. This way. you can +The `parse...()`-methods take a `CharSequence`, a `char`-array or a `byte`-array as argument. This way, you can parse from a `StringBuffer` or an array without having to convert your input to a `String`. Parsing from an array is -faster. because the parser can process multiple characters at once using SIMD instructions. +faster, because the parser can process multiple characters at once using SIMD instructions. ## Performance Tuning @@ -85,7 +85,7 @@ using a divide-and-conquer algorithm. Small sequences of digits are converted individually to bit sequences and then gradually combined to the final bit sequence. This algorithm needs to perform multiplications of very long bit sequences. The multiplications are performed in the frequency domain using a discrete fourier transform. -The multiplications in the frequency domain can be performed in `O(Nlog N (log log N))` time, +The multiplications in the frequency domain can be performed in `O(N log N (log log N))` time, where `N` is the number of digits. In contrast, conventional multiplication algorithms in the time domain need `O(N²)` time. 
@@ -206,34 +206,34 @@ on the same computer: model: generate random numbers uniformly in the interval [0.0.1.0] volume: 100000 floats volume = 2.09808 MB - netlib : 317.31 MB/s (+/- 6.0 %) 15.12 Mfloat/s 66.12 ns/f - doubleconversion : 263.89 MB/s (+/- 4.2 %) 12.58 Mfloat/s 79.51 ns/f - strtod : 86.13 MB/s (+/- 3.7 %) 4.10 Mfloat/s 243.61 ns/f - abseil : 467.27 MB/s (+/- 9.0 %) 22.27 Mfloat/s 44.90 ns/f - fastfloat : 880.79 MB/s (+/- 6.6 %) 41.98 Mfloat/s 23.82 ns/f - - OpenJDK 20-ea+22-1594 - java.lang.Double : 89.59 MB/s (+/- 6.0 %) 5.14 Mfloat/s 194.44 ns/f - JavaDoubleParser String : 485.97 MB/s (+/-13.8 %) 27.90 Mfloat/s 35.85 ns/f - JavaDoubleParser char[] : 562.55 MB/s (+/-10.0 %) 32.29 Mfloat/s 30.97 ns/f - JavaDoubleParser byte[] : 644.65 MB/s (+/- 8.7 %) 37.01 Mfloat/s 27.02 ns/f + netlib : 317.31 MB/s (+/- 6.0 %) 15.12 Mfloat/s 66.12 ns/f + doubleconversion : 263.89 MB/s (+/- 4.2 %) 12.58 Mfloat/s 79.51 ns/f + strtod : 86.13 MB/s (+/- 3.7 %) 4.10 Mfloat/s 243.61 ns/f + abseil : 467.27 MB/s (+/- 9.0 %) 22.27 Mfloat/s 44.90 ns/f + fastfloat : 880.79 MB/s (+/- 6.6 %) 41.98 Mfloat/s 23.82 ns/f + + OpenJDK 20+36-2344 + java.lang.Double : 93.97 MB/s (+/- 5.0 %) 5.39 Mfloat/s 185.43 ns/f 1.00 speedup + JavaDoubleParser String : 534.52 MB/s (+/-11.2 %) 30.67 Mfloat/s 32.60 ns/f 5.69 speedup + JavaDoubleParser char[] : 620.86 MB/s (+/- 9.9 %) 35.63 Mfloat/s 28.07 ns/f 6.61 speedup + JavaDoubleParser byte[] : 724.91 MB/s (+/- 5.7 %) 41.60 Mfloat/s 24.04 ns/f 7.71 speedup ' $ ./build/benchmarks/benchmark -f data/canada.txt # read 111126 lines volume = 1.93374 MB - netlib : 337.79 MB/s (+/- 5.8 %) 19.41 Mfloat/s 51.52 ns/f - doubleconversion : 254.22 MB/s (+/- 6.0 %) 14.61 Mfloat/s 68.45 ns/f - strtod : 73.33 MB/s (+/- 7.1 %) 4.21 Mfloat/s 237.31 ns/f - abseil : 411.11 MB/s (+/- 7.3 %) 23.63 Mfloat/s 42.33 ns/f - fastfloat : 741.32 MB/s (+/- 5.3 %) 42.60 Mfloat/s 23.47 ns/f - - OpenJDK 20-ea+29-2280 - java.lang.Double : 77.84 MB/s (+/- 4.1 %) 4.47 Mfloat/s 223.54 
ns/f 1.00 speedup - JavaDoubleParser String : 329.79 MB/s (+/-13.4 %) 18.95 Mfloat/s 52.77 ns/f 4.24 speedup - JavaDoubleParser char[] : 521.30 MB/s (+/-15.2 %) 29.96 Mfloat/s 33.38 ns/f 6.70 speedup - JavaDoubleParser byte[] : 560.48 MB/s (+/-12.7 %) 32.21 Mfloat/s 31.05 ns/f 7.20 speedup + netlib : 337.79 MB/s (+/- 5.8 %) 19.41 Mfloat/s 51.52 ns/f + doubleconversion : 254.22 MB/s (+/- 6.0 %) 14.61 Mfloat/s 68.45 ns/f + strtod : 73.33 MB/s (+/- 7.1 %) 4.21 Mfloat/s 237.31 ns/f + abseil : 411.11 MB/s (+/- 7.3 %) 23.63 Mfloat/s 42.33 ns/f + fastfloat : 741.32 MB/s (+/- 5.3 %) 42.60 Mfloat/s 23.47 ns/f + + OpenJDK 20+36-2344 + java.lang.Double : 82.56 MB/s (+/- 4.4 %) 4.74 Mfloat/s 210.76 ns/f 1.00 speedup + JavaDoubleParser String : 366.27 MB/s (+/- 9.7 %) 21.05 Mfloat/s 47.51 ns/f 4.44 speedup + JavaDoubleParser char[] : 571.76 MB/s (+/-11.4 %) 32.86 Mfloat/s 30.43 ns/f 6.93 speedup + JavaDoubleParser byte[] : 622.03 MB/s (+/- 7.5 %) 35.75 Mfloat/s 27.98 ns/f 7.53 speedup # Building and running the code @@ -299,6 +299,8 @@ java -XX:CompileCommand=inline,java/lang/String.charAt -p fastdoubleparser/targe export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_281.jdk/Contents/Home java -XX:CompileCommand=inline,java/lang/String.charAt -cp "fastdoubleparser/target/*:fastdoubleparserdemo/target/*" ch.randelshofer.fastdoubleparserdemo.Main --markdown java -XX:CompileCommand=inline,java/lang/String.charAt -cp "fastdoubleparser/target/*:fastdoubleparserdemo/target/*" ch.randelshofer.fastdoubleparserdemo.Main --markdown FastDoubleParserDemo/data/canada.txt +java -XX:CompileCommand=inline,java/lang/String.charAt -cp "fastdoubleparser/target/*:fastdoubleparserdemo/target/*" ch.randelshofer.fastdoubleparserdemo.Main --markdown FastDoubleParserDemo/data/mesh.txt +java -XX:CompileCommand=inline,java/lang/String.charAt -cp "fastdoubleparser/target/*:fastdoubleparserdemo/target/*" ch.randelshofer.fastdoubleparserdemo.Main --markdown FastDoubleParserDemo/data/canada_hex.txt 
``` ## IntelliJ IDEA with Java SE 8, 11, 17, 19 and 20 on macOS @@ -355,7 +357,7 @@ from the **-dev** module to the delta modules. ## Testing the code Unfortunately it is not possible to test floating parsers exhaustively, because the input -and output spaces are too big. +and output spaces are far too big. | Parser | Input Space | Output Space | |----------------------|-------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------| @@ -371,7 +373,8 @@ You can quickly run a number of hand-picked tests that aim for 100 % line covera mvn -DenableLongRunningTests=true test ``` -You can run additional tests with the following command. +You can run additional tests with the following command. The purpose of these tests is to explore additional +regions of the input and output spaces. ``` mvn -DenableLongRunningTests=true test diff --git a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromByteArray.java b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromByteArray.java index 3a48c2a8..e4c3dfc1 100644 --- a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromByteArray.java +++ b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromByteArray.java @@ -12,7 +12,6 @@ import static ch.randelshofer.fastdoubleparser.FastIntegerMath.createPowersOfTenFloor16Map; import static ch.randelshofer.fastdoubleparser.FastIntegerMath.fillPowersOfNFloor16Recursive; import static ch.randelshofer.fastdoubleparser.ParseDigitsTaskByteArray.RECURSION_THRESHOLD; -import static ch.randelshofer.fastdoubleparser.ParseDigitsTaskByteArray.parseDigits; /** @@ -298,27 +297,57 @@ BigDecimal 
parseBigDecimalStringWithManyDigits(byte[] str, int offset, int lengt ); } - + /** + * Parses a big decimal string after we have identified the parts of the significand, + * and after we have obtained the exponent value. + *
+     *       integerPartIndex
+     *       │  decimalPointIndex
+     *       │  │  nonZeroFractionalPartIndex
+     *       │  │  │  exponentIndicatorIndex
+     *       ↓  ↓  ↓  ↓
+     *     "-123.00456e-789"
+     *
+     * 
+ * + * @param str the input string + * @param integerPartIndex the start index of the integer part of the significand + * @param decimalPointIndex the index of the decimal point in the significand (same as exponentIndicatorIndex + * if there is no decimal point) + * @param nonZeroFractionalPartIndex the start index of the non-zero fractional part of the significand + * @param exponentIndicatorIndex the index of the exponent indicator (same as end of string if there is no + * exponent indicator) + * @param isNegative indicates that the significand is negative + * @param exponent the exponent value + * @return the parsed big decimal + */ private BigDecimal valueOfBigDecimalString(byte[] str, int integerPartIndex, int decimalPointIndex, int nonZeroFractionalPartIndex, int exponentIndicatorIndex, boolean isNegative, int exponent) { int fractionDigitsCount = exponentIndicatorIndex - decimalPointIndex - 1; int nonZeroFractionDigitsCount = exponentIndicatorIndex - nonZeroFractionalPartIndex; int integerDigitsCount = decimalPointIndex - integerPartIndex; NavigableMap powersOfTen = null; + // Parse the significand + // --------------------- + BigInteger significand; + + // If there is an integer part, we parse it using a recursive algorithm. + // The recursive algorithm needs a map with powers of ten, if we have more than RECURSION_THRESHOLD digits. 
BigInteger integerPart; if (integerDigitsCount > 0) { if (integerDigitsCount > RECURSION_THRESHOLD) { powersOfTen = createPowersOfTenFloor16Map(); fillPowersOfNFloor16Recursive(powersOfTen, integerPartIndex, decimalPointIndex); - integerPart = parseDigits(str, integerPartIndex, decimalPointIndex, powersOfTen); + integerPart = ParseDigitsTaskByteArray.parseDigitsRecursive(str, integerPartIndex, decimalPointIndex, powersOfTen); } else { - integerPart = parseDigits(str, integerPartIndex, decimalPointIndex, null); + integerPart = ParseDigitsTaskByteArray.parseDigitsRecursive(str, integerPartIndex, decimalPointIndex, null); } } else { integerPart = BigInteger.ZERO; } - BigInteger significand; + // If there is a fraction part, we parse it using a recursive algorithm. + // The recursive algorithm needs a map with powers of ten, if we have more than RECURSION_THRESHOLD digits. if (fractionDigitsCount > 0) { BigInteger fractionalPart; if (nonZeroFractionDigitsCount > RECURSION_THRESHOLD) { @@ -326,10 +355,11 @@ private BigDecimal valueOfBigDecimalString(byte[] str, int integerPartIndex, int powersOfTen = createPowersOfTenFloor16Map(); } fillPowersOfNFloor16Recursive(powersOfTen, nonZeroFractionalPartIndex, exponentIndicatorIndex); - fractionalPart = parseDigits(str, nonZeroFractionalPartIndex, exponentIndicatorIndex, powersOfTen); + fractionalPart = ParseDigitsTaskByteArray.parseDigitsRecursive(str, nonZeroFractionalPartIndex, exponentIndicatorIndex, powersOfTen); } else { - fractionalPart = parseDigits(str, nonZeroFractionalPartIndex, exponentIndicatorIndex, null); + fractionalPart = ParseDigitsTaskByteArray.parseDigitsRecursive(str, nonZeroFractionalPartIndex, exponentIndicatorIndex, null); } + // If the integer part is not 0, we combine it with the fraction part. 
if (integerPart.signum() == 0) { significand = fractionalPart; } else { @@ -340,8 +370,8 @@ private BigDecimal valueOfBigDecimalString(byte[] str, int integerPartIndex, int significand = integerPart; } - BigDecimal result = new BigDecimal(significand, -exponent); - return isNegative ? result.negate() : result; + // Combine the significand with the sign and the exponent + // ------------------------------------------------------ + return new BigDecimal(isNegative ? significand.negate() : significand, -exponent); } - } \ No newline at end of file diff --git a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromCharArray.java b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromCharArray.java index 0024e8fa..b7fe79a4 100644 --- a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromCharArray.java +++ b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromCharArray.java @@ -12,7 +12,6 @@ import static ch.randelshofer.fastdoubleparser.FastIntegerMath.createPowersOfTenFloor16Map; import static ch.randelshofer.fastdoubleparser.FastIntegerMath.fillPowersOfNFloor16Recursive; import static ch.randelshofer.fastdoubleparser.ParseDigitsTaskCharArray.RECURSION_THRESHOLD; -import static ch.randelshofer.fastdoubleparser.ParseDigitsTaskCharArray.parseDigits; /** @@ -297,25 +296,57 @@ BigDecimal parseBigDecimalStringWithManyDigits(char[] str, int offset, int lengt } + /** + * Parses a big decimal string after we have identified the parts of the significand, + * and after we have obtained the exponent value. + *
+     *       integerPartIndex
+     *       │  decimalPointIndex
+     *       │  │  nonZeroFractionalPartIndex
+     *       │  │  │  exponentIndicatorIndex
+     *       ↓  ↓  ↓  ↓
+     *     "-123.00456e-789"
+     *
+     * 
+ * + * @param str the input string + * @param integerPartIndex the start index of the integer part of the significand + * @param decimalPointIndex the index of the decimal point in the significand (same as exponentIndicatorIndex + * if there is no decimal point) + * @param nonZeroFractionalPartIndex the start index of the non-zero fractional part of the significand + * @param exponentIndicatorIndex the index of the exponent indicator (same as end of string if there is no + * exponent indicator) + * @param isNegative indicates that the significand is negative + * @param exponent the exponent value + * @return the parsed big decimal + */ private BigDecimal valueOfBigDecimalString(char[] str, int integerPartIndex, int decimalPointIndex, int nonZeroFractionalPartIndex, int exponentIndicatorIndex, boolean isNegative, int exponent) { int integerExponent = exponentIndicatorIndex - decimalPointIndex - 1; int fractionDigitsCount = exponentIndicatorIndex - nonZeroFractionalPartIndex; int integerDigitsCount = decimalPointIndex - integerPartIndex; NavigableMap powersOfTen = null; + + // Parse the significand + // --------------------- + BigInteger significand; + + // If there is an integer part, we parse it using a recursive algorithm. + // The recursive algorithm needs a map with powers of ten, if we have more than RECURSION_THRESHOLD digits. 
BigInteger integerPart; if (integerDigitsCount > 0) { if (integerDigitsCount > RECURSION_THRESHOLD) { powersOfTen = createPowersOfTenFloor16Map(); fillPowersOfNFloor16Recursive(powersOfTen, integerPartIndex, decimalPointIndex); - integerPart = parseDigits(str, integerPartIndex, decimalPointIndex, powersOfTen); + integerPart = ParseDigitsTaskCharArray.parseDigitsRecursive(str, integerPartIndex, decimalPointIndex, powersOfTen); } else { - integerPart = parseDigits(str, integerPartIndex, decimalPointIndex, null); + integerPart = ParseDigitsTaskCharArray.parseDigitsRecursive(str, integerPartIndex, decimalPointIndex, null); } } else { integerPart = BigInteger.ZERO; } - BigInteger significand; + // If there is a fraction part, we parse it using a recursive algorithm. + // The recursive algorithm needs a map with powers of ten, if we have more than RECURSION_THRESHOLD digits. if (fractionDigitsCount > 0) { BigInteger fractionalPart; if (fractionDigitsCount > RECURSION_THRESHOLD) { @@ -323,10 +354,11 @@ private BigDecimal valueOfBigDecimalString(char[] str, int integerPartIndex, int powersOfTen = createPowersOfTenFloor16Map(); } fillPowersOfNFloor16Recursive(powersOfTen, decimalPointIndex + 1, exponentIndicatorIndex); - fractionalPart = parseDigits(str, decimalPointIndex + 1, exponentIndicatorIndex, powersOfTen); + fractionalPart = ParseDigitsTaskCharArray.parseDigitsRecursive(str, decimalPointIndex + 1, exponentIndicatorIndex, powersOfTen); } else { - fractionalPart = parseDigits(str, decimalPointIndex + 1, exponentIndicatorIndex, null); + fractionalPart = ParseDigitsTaskCharArray.parseDigitsRecursive(str, decimalPointIndex + 1, exponentIndicatorIndex, null); } + // If the integer part is not 0, we combine it with the fraction part. 
if (integerPart.signum() == 0) { significand = fractionalPart; } else { @@ -337,7 +369,8 @@ private BigDecimal valueOfBigDecimalString(char[] str, int integerPartIndex, int significand = integerPart; } - BigDecimal result = new BigDecimal(significand, -exponent); - return isNegative ? result.negate() : result; + // Combine the significand with the sign and the exponent + // ------------------------------------------------------ + return new BigDecimal(isNegative ? significand.negate() : significand, -exponent); } } \ No newline at end of file diff --git a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromCharSequence.java b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromCharSequence.java index 5790a105..e6206983 100644 --- a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromCharSequence.java +++ b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigDecimalFromCharSequence.java @@ -12,7 +12,6 @@ import static ch.randelshofer.fastdoubleparser.FastIntegerMath.createPowersOfTenFloor16Map; import static ch.randelshofer.fastdoubleparser.FastIntegerMath.fillPowersOfNFloor16Recursive; import static ch.randelshofer.fastdoubleparser.ParseDigitsTaskCharSequence.RECURSION_THRESHOLD; -import static ch.randelshofer.fastdoubleparser.ParseDigitsTaskCharSequence.parseDigits; /** @@ -298,27 +297,57 @@ BigDecimal parseBigDecimalStringWithManyDigits(CharSequence str, int offset, int return valueOfBigDecimalString(str, nonZeroIntegerPartIndex, decimalPointIndex, nonZeroFractionalPartIndex, exponentIndicatorIndex, isNegative, (int) exponent); } - + /** + * Parses a big decimal string after we have identified the parts of the significand, + * and after we have obtained the exponent value. + *
+     *       integerPartIndex
+     *       │  decimalPointIndex
+     *       │  │  nonZeroFractionalPartIndex
+     *       │  │  │  exponentIndicatorIndex
+     *       ↓  ↓  ↓  ↓
+     *     "-123.00456e-789"
+     *
+     * 
+ * + * @param str the input string + * @param integerPartIndex the start index of the integer part of the significand + * @param decimalPointIndex the index of the decimal point in the significand (same as exponentIndicatorIndex + * if there is no decimal point) + * @param nonZeroFractionalPartIndex the start index of the non-zero fractional part of the significand + * @param exponentIndicatorIndex the index of the exponent indicator (same as end of string if there is no + * exponent indicator) + * @param isNegative indicates that the significand is negative + * @param exponent the exponent value + * @return the parsed big decimal + */ private BigDecimal valueOfBigDecimalString(CharSequence str, int integerPartIndex, int decimalPointIndex, int nonZeroFractionalPartIndex, int exponentIndicatorIndex, boolean isNegative, int exponent) { int fractionDigitsCount = exponentIndicatorIndex - decimalPointIndex - 1; int nonZeroFractionDigitsCount = exponentIndicatorIndex - nonZeroFractionalPartIndex; int integerDigitsCount = decimalPointIndex - integerPartIndex; NavigableMap powersOfTen = null; + // Parse the significand + // --------------------- + BigInteger significand; + + // If there is an integer part, we parse it using a recursive algorithm. + // The recursive algorithm needs a map with powers of ten, if we have more than RECURSION_THRESHOLD digits. 
BigInteger integerPart; if (integerDigitsCount > 0) { if (integerDigitsCount > RECURSION_THRESHOLD) { powersOfTen = createPowersOfTenFloor16Map(); fillPowersOfNFloor16Recursive(powersOfTen, integerPartIndex, decimalPointIndex); - integerPart = parseDigits(str, integerPartIndex, decimalPointIndex, powersOfTen); + integerPart = ParseDigitsTaskCharSequence.parseDigitsRecursive(str, integerPartIndex, decimalPointIndex, powersOfTen); } else { - integerPart = parseDigits(str, integerPartIndex, decimalPointIndex, null); + integerPart = ParseDigitsTaskCharSequence.parseDigitsRecursive(str, integerPartIndex, decimalPointIndex, null); } } else { integerPart = BigInteger.ZERO; } - BigInteger significand; + // If there is a fraction part, we parse it using a recursive algorithm. + // The recursive algorithm needs a map with powers of ten, if we have more than RECURSION_THRESHOLD digits. if (fractionDigitsCount > 0) { BigInteger fractionalPart; if (nonZeroFractionDigitsCount > RECURSION_THRESHOLD) { @@ -326,9 +355,9 @@ private BigDecimal valueOfBigDecimalString(CharSequence str, int integerPartInde powersOfTen = createPowersOfTenFloor16Map(); } fillPowersOfNFloor16Recursive(powersOfTen, nonZeroFractionalPartIndex, exponentIndicatorIndex); - fractionalPart = parseDigits(str, nonZeroFractionalPartIndex, exponentIndicatorIndex, powersOfTen); + fractionalPart = ParseDigitsTaskCharSequence.parseDigitsRecursive(str, nonZeroFractionalPartIndex, exponentIndicatorIndex, powersOfTen); } else { - fractionalPart = parseDigits(str, nonZeroFractionalPartIndex, exponentIndicatorIndex, null); + fractionalPart = ParseDigitsTaskCharSequence.parseDigitsRecursive(str, nonZeroFractionalPartIndex, exponentIndicatorIndex, null); } if (integerPart.signum() == 0) { significand = fractionalPart; @@ -340,7 +369,8 @@ private BigDecimal valueOfBigDecimalString(CharSequence str, int integerPartInde significand = integerPart; } - BigDecimal result = new BigDecimal(significand, -exponent); - return isNegative 
? result.negate() : result; + // Combine the significand with the sign and the exponent + // ------------------------------------------------------ + return new BigDecimal(isNegative ? significand.negate() : significand, -exponent); } } \ No newline at end of file diff --git a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromByteArray.java b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromByteArray.java index 531bd113..f03eeaaf 100644 --- a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromByteArray.java +++ b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromByteArray.java @@ -9,7 +9,6 @@ import java.util.Map; import static ch.randelshofer.fastdoubleparser.FastIntegerMath.fillPowersOf10Floor16; -import static ch.randelshofer.fastdoubleparser.ParseDigitsTaskByteArray.parseDigits; class JavaBigIntegerFromByteArray extends AbstractNumberParser { public final static int MAX_INPUT_LENGTH = 1_292_782_622; @@ -136,7 +135,7 @@ private BigInteger parseManyDecDigits(byte[] str, int from, int to, boolean isNe throw new NumberFormatException(VALUE_EXCEEDS_LIMITS); } Map powersOfTen = fillPowersOf10Floor16(from, to); - BigInteger result = parseDigits(str, from, to, powersOfTen); + BigInteger result = ParseDigitsTaskByteArray.parseDigitsRecursive(str, from, to, powersOfTen); return isNegative ? 
result.negate() : result; } @@ -149,5 +148,4 @@ private int skipZeroes(byte[] str, int from, int to) { } return from; } - } diff --git a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromCharArray.java b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromCharArray.java index 80236b18..db737a05 100644 --- a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromCharArray.java +++ b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromCharArray.java @@ -8,7 +8,6 @@ import java.util.Map; import static ch.randelshofer.fastdoubleparser.FastIntegerMath.fillPowersOf10Floor16; -import static ch.randelshofer.fastdoubleparser.ParseDigitsTaskCharArray.parseDigits; class JavaBigIntegerFromCharArray extends AbstractNumberParser { public final static int MAX_INPUT_LENGTH = 1_292_782_622; @@ -130,7 +129,7 @@ private BigInteger parseManyDecDigits(char[] str, int from, int to, boolean isNe throw new NumberFormatException(VALUE_EXCEEDS_LIMITS); } Map powersOfTen = fillPowersOf10Floor16(from, to); - BigInteger result = parseDigits(str, from, to, powersOfTen); + BigInteger result = ParseDigitsTaskCharArray.parseDigitsRecursive(str, from, to, powersOfTen); return isNegative ? 
result.negate() : result; } diff --git a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromCharSequence.java b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromCharSequence.java index 5ccc3f4b..e5d5ac6b 100644 --- a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromCharSequence.java +++ b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JavaBigIntegerFromCharSequence.java @@ -8,7 +8,6 @@ import java.util.Map; import static ch.randelshofer.fastdoubleparser.FastIntegerMath.fillPowersOf10Floor16; -import static ch.randelshofer.fastdoubleparser.ParseDigitsTaskCharSequence.parseDigits; class JavaBigIntegerFromCharSequence extends AbstractNumberParser { public final static int MAX_INPUT_LENGTH = 1_292_782_622; @@ -128,7 +127,7 @@ private BigInteger parseManyDecDigits(CharSequence str, int from, int to, boolea throw new NumberFormatException(VALUE_EXCEEDS_LIMITS); } Map powersOfTen = fillPowersOf10Floor16(from, to); - BigInteger result = parseDigits(str, from, to, powersOfTen); + BigInteger result = ParseDigitsTaskCharSequence.parseDigitsRecursive(str, from, to, powersOfTen); return isNegative ? 
result.negate() : result; } diff --git a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskByteArray.java b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskByteArray.java index 50c0fa62..fb3039f4 100644 --- a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskByteArray.java +++ b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskByteArray.java @@ -36,17 +36,8 @@ private ParseDigitsTaskByteArray() { */ public static final int RECURSION_THRESHOLD = 400; - static BigInteger parseDigits(byte[] str, int from, int to, Map powersOfTen) { - int numDigits = to - from; - if (numDigits < RECURSION_THRESHOLD) { - return parseDigitsIterative(str, from, to); - } else { - return parseDigitsRecursive(str, from, to, powersOfTen); - } - } - /** - * Parses digits in exponential time O(e^n). + * Parses digits in quadratic time O(N<sup>2</sup>). */ static BigInteger parseDigitsIterative(byte[] str, int from, int to) { int numDigits = to - from; @@ -68,7 +59,11 @@ static BigInteger parseDigitsIterative(byte[] str, int from, int to) { } /** - * Parses digits in exponential time O(e^n). + * Parses digits in O(N log N (log log N)) time. + *

+ * A conventional recursive algorithm would require O(N<sup>1.5</sup>). + * We achieve better performance by performing multiplications of long bit sequences + * in the frequency domain. */ static BigInteger parseDigitsRecursive(byte[] str, int from, int to, Map powersOfTen) { int numDigits = to - from; diff --git a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskCharArray.java b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskCharArray.java index 1a5c7e23..fb6cfc62 100644 --- a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskCharArray.java +++ b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskCharArray.java @@ -1,5 +1,5 @@ /* - * @(#)ParseDigitsTaskCharArray.java + * @(#)java * Copyright © 2023 Werner Randelshofer, Switzerland. MIT License. */ package ch.randelshofer.fastdoubleparser; @@ -37,17 +37,8 @@ private ParseDigitsTaskCharArray() { static final int RECURSION_THRESHOLD = 400; - static BigInteger parseDigits(char[] str, int from, int to, Map powersOfTen) { - int numDigits = to - from; - if (numDigits < RECURSION_THRESHOLD) { - return ParseDigitsTaskCharArray.parseDigitsIterative(str, from, to); - } else { - return ParseDigitsTaskCharArray.parseDigitsRecursive(str, from, to, powersOfTen); - } - } - /** - * Parses digits in exponential time O(e^n). + * Parses digits in quadratic time O(N<sup>2</sup>). */ static BigInteger parseDigitsIterative(char[] str, int from, int to) { int numDigits = to - from; @@ -69,7 +60,11 @@ static BigInteger parseDigitsIterative(char[] str, int from, int to) { } /** - * Parses digits in exponential time O(e^n). + * Parses digits in O(N log N (log log N)) time. + *

+ * A conventional recursive algorithm would require O(N<sup>1.5</sup>). + * We achieve better performance by performing multiplications of long bit sequences + * in the frequency domain. */ static BigInteger parseDigitsRecursive(char[] str, int from, int to, Map powersOfTen) { int numDigits = to - from; diff --git a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskCharSequence.java b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskCharSequence.java index 8645a59c..11c936b5 100644 --- a/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskCharSequence.java +++ b/fastdoubleparser-dev/src/main/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/ParseDigitsTaskCharSequence.java @@ -37,17 +37,8 @@ private ParseDigitsTaskCharSequence() { public static final int RECURSION_THRESHOLD = 400; - static BigInteger parseDigits(CharSequence str, int from, int to, Map powersOfTen) { - int numDigits = to - from; - if (numDigits < RECURSION_THRESHOLD) { - return parseDigitsIterative(str, from, to); - } else { - return parseDigitsRecursive(str, from, to, powersOfTen); - } - } - /** - * Parses digits in exponential time O(e^n). + * Parses digits in quadratic time O(N<sup>2</sup>). */ static BigInteger parseDigitsIterative(CharSequence str, int from, int to) { int numDigits = to - from; @@ -69,7 +60,11 @@ static BigInteger parseDigitsIterative(CharSequence str, int from, int to) { } /** - * Parses digits in exponential time O(e^n). + * Parses digits in O(N log N (log log N)) time. + *

+ * A conventional recursive algorithm would require O(N<sup>1.5</sup>). + * We achieve better performance by performing multiplications of long bit sequences + * in the frequency domain. */ static BigInteger parseDigitsRecursive(CharSequence str, int from, int to, Map powersOfTen) { // Base case: All sequences of 18 or fewer digits fit into a long. diff --git a/fastdoubleparser-dev/src/test/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JmhJavaBigIntegerFromByteArrayScalability.java b/fastdoubleparser-dev/src/test/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JmhJavaBigIntegerFromByteArrayScalability.java index 7bd7e2ae..66eeebf2 100644 --- a/fastdoubleparser-dev/src/test/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JmhJavaBigIntegerFromByteArrayScalability.java +++ b/fastdoubleparser-dev/src/test/java/ch.randelshofer.fastdoubleparser/ch/randelshofer/fastdoubleparser/JmhJavaBigIntegerFromByteArrayScalability.java @@ -30,29 +30,29 @@ * # VM version: JDK 20-ea, OpenJDK 64-Bit Server VM, 20-ea+29-2280 * # Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz * - * (digits) Mode Cnt _ Score Error Units - * dec 1 avgt 4 _ 3.895 ± 0.434 ns/op - * dec 10 avgt 4 _ 14.167 ± 6.323 ns/op - * dec 100 avgt 4 _ 423.402 ± 18.461 ns/op - * dec 1000 avgt 4 _ 4871.160 ± 234.454 ns/op - * dec 10000 avgt 4 _ 159159.396 ± 4277.295 ns/op - * dec 100000 avgt 4 _ 5095496.842 ± 128428.326 ns/op - * dec 1000000 avgt 4 _ 83316446.206 ± 2572808.486 ns/op - * dec 10000000 avgt 4 1_337268796.438 ± 122864907.274 ns/op - * dec 100000000 avgt 4 22_335176215.000 ± 967941880.037 ns/op - * dec 646456993 avgt 201_927131337.000 ns/op - * dec 1292782621 avgt 4 198_901668791.500 ± 4310147258.136 ns/op - * hex 1 avgt 4 _ 15.576 ± 0.693 ns/op - * hex 10 avgt 4 _ 27.551 ± 9.898 ns/op - * hex 100 avgt 4 _ 121.339 ± 7.258 ns/op - * hex 1000 avgt 4 _ 1043.819 ± 28.706 ns/op - * hex 10000 avgt 4 _ 10741.632 ± 258.920 ns/op - * hex 100000 avgt 4 _ 112710.224 ± 
2946.230 ns/op - * hex 1000000 avgt 4 _ 1145607.433 ± 37200.668 ns/op - * hex 10000000 avgt 4 _ 12940545.545 ± 182941.922 ns/op - * hex 100000000 avgt 4 _133010989.979 ± 9262778.811 ns/op - * hex 646456993 avgt 4 _786577513.250 ± 86855481.927 ns/op - * hex 1292782621 avgt 4 _881336008.671 ± 248831282.555 ns/op + * (digits) Mode Cnt _ Score Error Units + * dec 1 avgt 4 _ 3.895 ± 0.051 ns/op + * dec 10 avgt 4 _ 13.073 ± 1.012 ns/op + * dec 100 avgt 4 _ 417.791 ± 7.421 ns/op + * dec 1000 avgt 4 _ 4713.890 ± 95.678 ns/op + * dec 10000 avgt 4 _ 159283.207 ± 5076.778 ns/op + * dec 100000 avgt 4 _ 5148743.967 ± 269250.312 ns/op + * dec 1000000 avgt 4 _ 82326733.043 ± 988263.342 ns/op + * dec 10000000 avgt 4 1_359363201.768 ± 306440228.295 ns/op + * dec 100000000 avgt 4 22_241605723.500 ± 1486815357.018 ns/op + * dec 646456993 avgt 4 202_298930337.000 ± 10730148276.875 ns/op + * dec 1292782621 avgt 4 197_858301116.000 ± 5372926664.595 ns/op + * hex 1 avgt 4 _ 15.576 ± 0.693 ns/op + * hex 10 avgt 4 _ 27.551 ± 9.898 ns/op + * hex 100 avgt 4 _ 121.339 ± 7.258 ns/op + * hex 1000 avgt 4 _ 1043.819 ± 28.706 ns/op + * hex 10000 avgt 4 _ 10741.632 ± 258.920 ns/op + * hex 100000 avgt 4 _ 112710.224 ± 2946.230 ns/op + * hex 1000000 avgt 4 _ 1145607.433 ± 37200.668 ns/op + * hex 10000000 avgt 4 _ 12940545.545 ± 182941.922 ns/op + * hex 100000000 avgt 4 _133010989.979 ± 9262778.811 ns/op + * hex 646456993 avgt 4 _786577513.250 ± 86855481.927 ns/op + * hex 1292782621 avgt 4 _881336008.671 ± 248831282.555 ns/op * */ @Fork(value = 1, jvmArgsAppend = { @@ -108,7 +108,6 @@ public void setUp() { hexLiteral = str.getBytes(StandardCharsets.ISO_8859_1); } - @Benchmark public BigInteger hex() { return JavaBigIntegerParser.parseBigInteger(hexLiteral, 16);