Node V20.15.0 Crash on Process Exiting #56245

Open
kekee000 opened this issue Dec 13, 2024 · 2 comments

kekee000 commented Dec 13, 2024

Version

20.15.0

Platform

Linux 5.10.0-1.0.0.26 #1 SMP Thu Apr 20 06:40:16 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Subsystem

CentOS release 4.3 (Final)

What steps will reproduce the bug?

We have a large Node.js cluster running over 1000 Node processes. Each process has 8 cores available and runs 10 worker_threads that handle SSR rendering tasks in parallel.

These threads load several native binding modules, including NAN and N-API modules. The modules themselves are unlikely to be the cause, because a larger cluster runs them under Node 14 without any crashes.

When we deploy new code, we stop the old Node services and restart them.
During shutdown, a process will occasionally crash and generate a core dump. Below is the core dump information:

(lldb) target create "node_env_20/bin/node" --core "/home/coredumps/core.node.197815.20241223164825"
Core file '/home/coredumps/core.node.197815.20241223164825' (x86_64) was loaded.
(lldb) plugin load "/home/work/search/node_env/tools/bin/../lib/llnode.so"
(lldb) settings set prompt '(llnode) '
(llnode) bt
* thread #1, name = 'node_ui_child', stop reason = signal SIGSEGV
  * frame #0: 0x00000000010be6a8 node`v8::internal::CppMarkingState::MarkAndPush(v8::internal::EmbedderDataSlot, v8::internal::EmbedderDataSlot) + 56
(llnode) quit

Each restart produces roughly 10 crashes, which is about 1% of the processes.
Every core dump points to CppMarkingState::MarkAndPush, and the crash only happens on x64; on aarch64 no core dumps are produced.

Since our app needs to be deployed multiple times a day and too many core dumps trigger deploy warnings, we have to fix this problem.

I looked at ResetCreateHistogramFunction in the V8 engine bundled with Node and found that it is not called in release builds of Node; it is only invoked in d8 and in debug mode.

Could you please give me some advice on this issue and help solve this crash?

Thanks.

How often does it reproduce? Is there a required condition?

Node 20 on x64 with worker_threads, while the process is exiting.
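
The conditions described in this report can be sketched roughly as follows. This is only an illustration of the scenario (several busy worker_threads, then the main thread exits while they are still running); it has not been confirmed to reproduce the crash, and it does not load any native modules. `WORKER_SOURCE`, `startWorkers` and the `--exit-now` flag are hypothetical names made up for this sketch, and the allocation loop is a stand-in for real SSR work.

```javascript
'use strict';
const { Worker } = require('node:worker_threads');

// Hypothetical stand-in for an SSR task: allocate short-lived objects in a
// loop so the worker's heap stays busy and garbage collection runs often.
const WORKER_SOURCE = `
  const { parentPort } = require('node:worker_threads');
  parentPort.postMessage('ready');
  function churn() {
    const junk = [];
    for (let i = 0; i < 10000; i++) junk.push({ i, text: 'x'.repeat(32) });
    setImmediate(churn);
  }
  churn();
`;

// Start `count` workers and resolve once each one has reported 'ready'.
function startWorkers(count) {
  const pending = [];
  for (let i = 0; i < count; i++) {
    pending.push(new Promise((resolve, reject) => {
      const worker = new Worker(WORKER_SOURCE, { eval: true });
      worker.once('message', () => resolve(worker));
      worker.once('error', reject);
    }));
  }
  return Promise.all(pending);
}

// Exit abruptly only when explicitly asked to (assumed flag for this sketch):
//   node repro.js --exit-now
if (process.argv.includes('--exit-now')) {
  // 10 workers mirrors the per-process thread count mentioned in the report.
  startWorkers(10).then(() => process.exit(0));
}
```

Running the file with `--exit-now` in a loop on an x64 machine would exercise the exit path many times; without the flag it only defines the helper and does nothing.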

What is the expected behavior? Why is that the expected behavior?

No core dumps when the process exits.

What do you see instead?

Processes crash randomly; about 1% of processes crash on exit during a single deploy.

Additional information

No response

@bnoordhuis
Member

Can you post the full backtrace?

@kekee000
Author

kekee000 commented Dec 16, 2024

This is the full backtrace from all of the Node 20 core files:

* thread #1, name = 'node_ui_exiting', stop reason = signal SIGSEGV
  * frame #0: 0x00000000010be6a8 node`v8::internal::CppMarkingState::MarkAndPush(v8::internal::EmbedderDataSlot, v8::internal::EmbedderDataSlot) + 56
