Function producing different outputs for same inputs from different paths #1007

Marie-Joseph · 2024-10-23T00:29:09Z

Hello,

I have recently implemented SHA-1 and SHA-2 in R7RS Scheme. In the process of implementing the 64-bit variants of SHA-2 (the SHA-512 family), I implemented SHA-512/t, an algorithm which generates new seed values according to the process described in FIPS 180-4 5.3.6, uses them to run SHA-512 on a given input, then truncates the result to a specific size (t). I implemented tests for this procedure by comparing its output when passed t = 256 and t = 224 against the standalone variants for those sizes. Most of these tests pass just fine, but one does not -- and before a recent change to use bit-field and macros for slight runtime performance increases (bit-field is about 4 times faster for truncation than bitwise-and), both of the "multiple blocks" tests failed.

As you can verify yourself in the relevant code, and see more clearly by running make in the debug branch, the failing test ultimately calls sha-2-64 twice with identical inputs, but gets a different output. Specifically, the third iteration of the compression pass of the second block is different. Even more specifically, the bitwise-only check function produces 0 for sha-512/t but an actual value for sha-512/256 despite receiving the exact same inputs for both. In the printed debug output, this would be t= 2: ... following the message block that starts with '0's, the first and third instances of t= 2: up from the bottom of the output.

Note that when reading the debug output, there's a block between sha-512/256 and sha-512/t called with 256 -- this is the calculation of the seeds as described in the aforementioned part of FIPS 180-4 and can be safely ignored.

I suspect this may be related to the stack or bignums.

Thanks,
Juli

The text was updated successfully, but these errors were encountered:

Marie-Joseph · 2024-10-23T00:32:03Z

Actually, for posterity, here's the mentioned printed output with some additional debug printing. On the left is sha-512/t and on the right is sha-512/256.

The order in which the functions is called in testing does not change the output.

ashinn · 2024-10-23T09:45:48Z

(That was confusing, I at first read the name as "Juliana Shinn" and wondered when my sister had started programming...)

Thank you for the detailed report! This looks like either a bug in the bitwise ops on bignums, or possibly some general heap corruption (although that seems less likely, especially if the test is repeatable).

If you can narrow down the specific bitwise function and inputs where the values diverge that would help a lot, otherwise I'll take a look when I find time.

ashinn · 2024-10-23T14:00:42Z

By the way, are you using 0.11 or HEAD?

Marie-Joseph · 2024-10-24T03:31:07Z

Thank you for the detailed report! This looks like either a bug in the bitwise ops on bignums, or possibly some general heap corruption (although that seems less likely, especially if the test is repeatable).

If you can narrow down the specific bitwise function and inputs where the values diverge that would help a lot, otherwise I'll take a look when I find time.

Well, it seems that this error is not, in fact, repeatable -- not the next day, anyway. I'm not sure what happened, but the test is passing fine now.

By the way, are you using 0.11 or HEAD?

0.11

EDIT: the bug is back! just a moment

Marie-Joseph · 2024-10-24T03:59:22Z

The specific incorrect computation is (bitwise-ior 4756225156975034538 11529219719034833984).

The computation unfolds thusly:

(define (ch x y z)
  (bitwise-ior
    (bitwise-and x y)
    (bitwise-and (bitwise-not x) z)))

(ch 5633329806339096767 4828357668228770026 12548180113516416124)
->
(bitwise-ior
  (bitwise-and 5633329806339096767 4828357668228770026)
  (bitwise-and (bitwise-not 5633329806339096767) 12548180113516416124))
->
(bitwise-ior
  (bitwise-and 5633329806339096767 4828357668228770026)
  (bitwise-and -5633329806339096768 12548180113516416124))
->
(bitwise-ior
  4756225156975034538
  11529219719034833984)

At this point, the correct route produces 16285444876009868522 and the incorrect route produces 0.

ashinn · 2024-10-24T09:44:29Z

(bitwise-ior
  4756225156975034538
  11529219719034833984)

consistently produces the correct result for me. Probably a GC bug in one of the C coded bitwise operators, let me audit the code.

ashinn · 2024-10-28T01:52:51Z

I cloned your repo and was able to reproduce the failed test. For simplicity I commented out all the other tests and changed to use (chibi test):

$ chibi -Isrc tests/test-sha-2.scm 
test-sha-512_t: 
 x
0 out of 1 (0%) tests passed in 0.04399609565734863 seconds.
1 failure (100.0%).
FAIL: (sha-512/t (string->utf8 str-3) 256)
    expected 25854062729967050020067329891762006855908633897400075620255178771425459394106 but got 68313699301727291834076519907935432851858577852930143704247139022284457335091
    on line 367 of file "tests/test-sha-2.scm"

I was unable to reproduce a non-deterministic bitwise-ior call, however. I checked a few ways, but it's worth illustrating how to do this using (chibi trace):

chibi-scheme -Isrc -tsrfi.151.bitwise-ior tests/test-sha-2.scm 2>&1 >/dev/null `# redirect stderr to stdout` |\
  grep -A1 '^>'   `# extract only the traced commands and results (the next line)` |\
  paste - -       `# join the result onto the same line` |\
  sed 's/^> //'   `# strip the prompt` |\
  awk -F'\t' 'BEGIN{OFS=FS} {print $2, $1}'   `# swap to (result, application) order` |\
  sort -u         `# remove duplicate lines` |\
  uniq -D --skip-fields=1                     `# print any duplicate calls, ignoring the result`

I then repeated this check with all of the other SRFI 151 procedures, your (common) library procedures, and the basic arithmetic operators, and could not find any source of non-determinism.

Marie-Joseph changed the title ~~Unexpected non-determinism~~ Function producing different outputs for same inputs from different paths Oct 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Function producing different outputs for same inputs from different paths #1007

Function producing different outputs for same inputs from different paths #1007

Marie-Joseph commented Oct 23, 2024 •

edited

Loading

Marie-Joseph commented Oct 23, 2024 •

edited

Loading

ashinn commented Oct 23, 2024

ashinn commented Oct 23, 2024

Marie-Joseph commented Oct 24, 2024 •

edited

Loading

Marie-Joseph commented Oct 24, 2024

ashinn commented Oct 24, 2024

ashinn commented Oct 28, 2024 •

edited

Loading

Function producing different outputs for same inputs from different paths #1007

Function producing different outputs for same inputs from different paths #1007

Comments

Marie-Joseph commented Oct 23, 2024 • edited Loading

Marie-Joseph commented Oct 23, 2024 • edited Loading

ashinn commented Oct 23, 2024

ashinn commented Oct 23, 2024

Marie-Joseph commented Oct 24, 2024 • edited Loading

Marie-Joseph commented Oct 24, 2024

ashinn commented Oct 24, 2024

ashinn commented Oct 28, 2024 • edited Loading

Marie-Joseph commented Oct 23, 2024 •

edited

Loading

Marie-Joseph commented Oct 23, 2024 •

edited

Loading

Marie-Joseph commented Oct 24, 2024 •

edited

Loading

ashinn commented Oct 28, 2024 •

edited

Loading