Add OOM crash diagnostics #214

Open
sfackler opened this issue Dec 29, 2024 · 0 comments

sfackler commented Dec 29, 2024

If a server gets OOM killed, there isn't currently a great way to debug it remotely.

While we can't do something like Java's HeapDumpOnOutOfMemoryError, we can probably enable jemalloc's prof.gdump to generate heap profiles every time memory usage hits a new high. We won't see the exact cause of the OOM kill, but we should get close as long as the OOM wasn't caused by a single huge allocation. We could then make a diagnostic that returns the previous process's last dump.
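As a rough sketch of the "previous process's last dump" lookup: jemalloc names gdump files `<prefix>.<pid>.<seq>.u<useq>.heap`, so the diagnostic could parse the PID out of each file name, skip files belonging to the current process, and return the most recently modified one. The names `DumpName`, `parse_dump_name`, and `last_previous_dump` are made up for illustration, and using modification time to decide which process was "previous" is an assumption, since PIDs are not ordered.

```rust
use std::time::SystemTime;

/// Fields parsed from a gdump file name of the form
/// `<prefix>.<pid>.<seq>.u<useq>.heap`.
#[derive(Debug, PartialEq, Eq)]
pub struct DumpName {
    pub pid: u32,
    pub seq: u64,
}

/// Parses a gdump file name, returning `None` for anything else
/// (including other dump kinds such as `.f.heap` or `.i<n>.heap`).
pub fn parse_dump_name(file_name: &str) -> Option<DumpName> {
    // Split from the right so that dots inside the prefix are harmless.
    let mut parts = file_name.rsplitn(5, '.');
    if parts.next()? != "heap" {
        return None;
    }
    // The `u` marker identifies dumps triggered by a new memory high.
    let useq = parts.next()?.strip_prefix('u')?;
    useq.parse::<u64>().ok()?;
    let seq = parts.next()?.parse().ok()?;
    let pid = parts.next()?.parse().ok()?;
    Some(DumpName { pid, seq })
}

/// Picks the newest gdump file that was not written by `current_pid`.
/// `files` pairs each file name with its modification time.
pub fn last_previous_dump(files: &[(String, SystemTime)], current_pid: u32) -> Option<&str> {
    files
        .iter()
        .filter_map(|(name, mtime)| {
            let parsed = parse_dump_name(name)?;
            (parsed.pid != current_pid).then_some((mtime, parsed.seq, name.as_str()))
        })
        // Newest modification time wins; the sequence number breaks ties.
        .max_by_key(|(mtime, seq, _)| (**mtime, *seq))
        .map(|(_, _, name)| name)
}
```

A caller would collect `(file_name, modified)` pairs from the dump directory and pass `std::process::id()` as `current_pid`.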

We'd need to do some work to manage the files it generates so they don't accumulate without bound.
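One possible shape for that cleanup, as a sketch only: keep the newest few `.heap` files in the dump directory and delete the rest. The function names `select_stale` and `prune_heap_dumps` and the keep-N-newest policy are assumptions for illustration, not something the project has settled on.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Returns the paths to delete so that only the `keep` newest dumps remain.
pub fn select_stale(mut dumps: Vec<(PathBuf, SystemTime)>, keep: usize) -> Vec<PathBuf> {
    // Sort newest first, then everything past the first `keep` is stale.
    dumps.sort_by(|a, b| b.1.cmp(&a.1));
    dumps.into_iter().skip(keep).map(|(path, _)| path).collect()
}

/// Deletes all but the `keep` newest `.heap` files in `dir`.
/// Returns the number of files removed.
pub fn prune_heap_dumps(dir: &Path, keep: usize) -> io::Result<usize> {
    let mut dumps = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        // Only touch files that look like heap profiles.
        if path.extension().and_then(|e| e.to_str()) == Some("heap") {
            dumps.push((path, entry.metadata()?.modified()?));
        }
    }
    let stale = select_stale(dumps, keep);
    for path in &stale {
        fs::remove_file(path)?;
    }
    Ok(stale.len())
}
```

This could run at startup and periodically afterwards. Since gdump writes a file on every new memory high, a small `keep` value is probably enough to preserve the dump closest to the OOM kill.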
