[ocaml5-issue] Segfault or hang on macOS in STM Gc stress test parallel #480

Open
jmid opened this issue Sep 27, 2024 · 2 comments
Labels
ocaml5-issue A potential issue in the OCaml5 compiler/runtime

jmid commented Sep 27, 2024

The upcoming Gc test in #469 has surfaced a macOS segfault while running STM Gc stress test parallel.

This first happened on 5.3.0+trunk:
https://github.com/ocaml-multicore/multicoretests/actions/runs/11010731348/job/30573309071

random seed: 464277843
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential (generating)
[✗]    9    0    1    8 / 1000     0.0s STM Gc test sequential

[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential in child domain
[✗]    9    0    1    8 / 1000     0.0s STM Gc test sequential in child domain

[ ]    0    0    0    0 / 1000     0.0s STM Gc test parallel
[✓]    2    0    1    1 / 1000     8.9s STM Gc test parallel

File "src/gc/dune", line 4, characters 7-16:
4 |  (name stm_tests)
           ^^^^^^^^^
(cd _build/default/src/gc && ./stm_tests.exe --verbose)
Command got signal SEGV.
[ ]    0    0    0    0 / 1000     0.0s STM Gc stress test parallel

and then on 5.2.0:
https://github.com/ocaml-multicore/multicoretests/actions/runs/11037096487/job/30657143459?pr=469

random seed: 377633884
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential (generating)
[✓] 1000    0    0 1000 / 1000     0.2s STM Gc test sequential

[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential in child domain
[✓] 1000    0    0 1000 / 1000     3.3s STM Gc test sequential in child domain

[ ]    0    0    0    0 / 1000     0.0s STM Gc test parallel
[✓]    1    0    1    0 / 1000    25.9s STM Gc test parallel

[ ]    0    0    0    0 / 1000     0.0s STM Gc stress test parallel
File "src/gc/dune", line 13, characters 7-16:
13 |  (name stm_tests)
            ^^^^^^^^^
(cd _build/default/src/gc && ./stm_tests.exe --verbose)
Command got signal SEGV.
jmid added the ocaml5-issue label (A potential issue in the OCaml5 compiler/runtime) on Sep 27, 2024
jmid mentioned this issue on Sep 27, 2024

jmid commented Oct 23, 2024

The latest CI run also triggered a Gc test crash on 5.2.0 under macOS with an Intel/amd64 CPU:
https://github.com/ocaml-multicore/multicoretests/actions/runs/11479668195/job/31946444432

random seed: 302956495
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential (generating)
[✓] 1000    0    0 1000 / 1000     0.4s STM Gc test sequential

[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential in child domain
[✓] 1000    0    0 1000 / 1000     0.9s STM Gc test sequential in child domain

[ ]    0    0    0    0 / 1000     0.0s STM Gc test parallel
[✓]   21    0    1   20 / 1000     7.4s STM Gc test parallel

File "src/gc/dune", line 13, characters 7-16:
13 |  (name stm_tests)
            ^^^^^^^^^
(cd _build/default/src/gc && ./stm_tests.exe --verbose)
Command got signal BUS.
[ ]    0    0    0    0 / 1000     0.0s STM Gc stress test parallel

jmid changed the title from "[ocaml5-issue] Segfault on macOS ARM64 in STM Gc stress test parallel" to "[ocaml5-issue] Segfault on macOS in STM Gc stress test parallel" on Oct 23, 2024

jmid commented Oct 28, 2024

I realized that this issue may also cause hangs or infinite loops.
Here's a fresh case on macOS-ARM64 testing trunk where the 18th repetition has taken ~1 hour to complete,
whereas each of the previous 17 repetitions completed in under 1 min:
https://github.com/ocaml-multicore/multicoretests/actions/runs/11555386874/job/32160677095

Starting 18-th run
Page size: 16384

random seed: 253047476
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential (generating)
[✓] 1000    0    0 1000 / 1000     0.2s STM Gc test sequential

[ ]    0    0    0    0 / 1000     0.0s STM Gc test sequential in child domain
[✓] 1000    0    0 1000 / 1000     0.5s STM Gc test sequential in child domain

[ ]    0    0    0    0 / 1000     0.0s STM Gc test parallel
[✓]    2    0    1    1 / 1000     8.9s STM Gc test parallel

[hang]
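
Since the failure shows up as SEGV, BUS, or a hang depending on the run, it can help to classify each repetition automatically instead of waiting for the CI job to stall. The following is only a minimal sketch, not part of the multicoretests CI: the helper name `classify_run` and the timeout value in the usage note are assumptions made for illustration, and the signal decoding relies on POSIX behaviour.

```python
import signal
import subprocess


def classify_run(cmd, timeout_s):
    """Run cmd once and classify the outcome.

    Returns "ok", "exit N", a signal name such as "SIGSEGV", or "hang".
    (Hypothetical helper for illustration only.)
    """
    try:
        proc = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising,
        # so a hung test does not keep running in the background.
        return "hang"
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, a return code of -N means the child was killed by signal N.
        try:
            return signal.Signals(-proc.returncode).name
        except ValueError:
            return f"signal {-proc.returncode}"
    return f"exit {proc.returncode}"
```

Assuming the test binary has been built and the working directory is `_build/default/src/gc`, a repetition could then be run as `classify_run(["./stm_tests.exe", "--verbose"], timeout_s=300)`. The 300-second limit is an arbitrary choice, picked only because the successful repetitions above each finished in under a minute.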

jmid changed the title from "[ocaml5-issue] Segfault on macOS in STM Gc stress test parallel" to "[ocaml5-issue] Segfault or hang on macOS in STM Gc stress test parallel" on Oct 28, 2024