Abnormally high memory usage in mainframe on staging #311

Open
jonathan-d-zhang opened this issue Aug 19, 2024 · 4 comments
Comments

@jonathan-d-zhang
Contributor

Prod mainframe uses ~100MiB while staging uses ~250MiB. I suspect this is due to SQLAlchemy caching the distributions field that was recently added.

@jonathan-d-zhang
Contributor Author

jonathan-d-zhang commented Aug 22, 2024

SQLAlchemy's identity map would be collected as soon as the session was closed, so I don't think my original guess is correct. I also found https://docs.pydantic.dev/latest/concepts/json/#caching-strings, but by a quick estimate, a completely full cache of 63-character strings would be at most 2 MiB.
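For reference, a rough version of that back-of-the-envelope estimate (the cache capacity below is my assumption, not a figure from the Pydantic docs):

```python
# Back-of-the-envelope upper bound on Pydantic's JSON string cache.
# ASSUMPTION: the cache holds on the order of 16k entries; the real
# capacity is an implementation detail and may differ.
CACHE_ENTRIES = 16_384
STR_OVERHEAD = 49   # approximate CPython object overhead for a short ASCII str, in bytes
STR_LENGTH = 63     # longest string length the cache will store

total_bytes = CACHE_ENTRIES * (STR_OVERHEAD + STR_LENGTH)
print(f"{total_bytes / 2**20:.2f} MiB")  # ~1.75 MiB, i.e. at most ~2 MiB
```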

Also, when attempting to increase the memory usage by requesting a large time span with GET /package?since=<30 minutes ago> (to simulate the situation described in vipyrsec/bot#255), the memory usage jumped up to ~350MiB, but did not go down even after a few minutes. This leads me to believe that mainframe is somehow holding onto the memory used to build the response. I'm using memray to test.
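For anyone reproducing this, a minimal sketch of the memray workflow (the module path mainframe.server:app and the port are assumptions about how the app is started locally, not the actual deployment command):

```bash
# Run the app under memray, tracking Python-level allocations as well.
memray run --trace-python-allocators -o mainframe.bin -m uvicorn mainframe.server:app

# ...send the GET /package request, wait a few minutes, then stop the server...

# Generate a flamegraph; --leaks highlights memory still allocated at shutdown.
memray flamegraph --leaks mainframe.bin
```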

@Robin5605
Contributor

> Also, when attempting to increase the memory usage by requesting a large time span with GET /package?since=<30 minutes ago> (to simulate the situation described in vipyrsec/bot#255), the memory usage jumped up to ~350MiB, but did not go down even after a few minutes. This leads me to believe that mainframe is somehow holding onto the memory used to build the response. I'm using memray to test.

After some investigation, it doesn't seem like this issue is necessarily introduced by this PR. I've run memray against the main branch, then requested GET /package?since=<3 days ago> for a very large response body, and saw this memory usage graph:
[memray memory usage graph from the main branch run]

A good chunk of memory is still being held onto.
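For anyone reproducing this, a sketch of the request that produced the large response body (the since timestamp format and the local port are assumptions; adjust to whatever the endpoint actually expects):

```python
import datetime
import httpx

# ASSUMPTION: `since` is a Unix timestamp and the app is running locally on port 8000.
since = int((datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=3)).timestamp())
response = httpx.get("http://localhost:8000/package", params={"since": since}, timeout=120)
print(f"{len(response.content) / 2**20:.1f} MiB of JSON")
```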

@Robin5605
Contributor

[memray memory usage graph showing memory still allocated after the response completed]

This is about 91MB of memory still being retained a while after the response completed. I'm not sure why it's being held onto.

@Robin5605
Contributor

I believe the culprit here is Pydantic.
I've tried using msgspec; the benchmarking section on JSON Serialization - Large Data may be of particular interest to us. I ran a test by first sending GET /package with a since of 20 days ago (~34MB JSON response), which replicates what you described in the issue comment: memory usage spiking to 350MiB and not going back down. I then swapped out the Pydantic model for a msgspec Struct. The most obvious difference was serialization time, which dropped from 969ms on average (with Pydantic) to 197ms on average (with msgspec), almost a 5x speedup. Memory usage still spiked from the 104 MiB baseline to 200 MiB (not sure what to do about that; the same thing happens with Pydantic), but the key difference is that it goes back down to 108 MiB, which is significantly better than spiking to over 300MiB and staying there.
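A minimal sketch of the kind of swap I'm describing (the field names here are made up for illustration and don't match mainframe's actual schema):

```python
import msgspec
from pydantic import BaseModel, TypeAdapter

# Hypothetical package record, for illustration only.
class PackageModel(BaseModel):
    name: str
    version: str
    distributions: list[str]

class PackageStruct(msgspec.Struct):
    name: str
    version: str
    distributions: list[str]

pydantic_rows = [
    PackageModel(name=f"pkg-{i}", version="1.0.0", distributions=["sdist", "wheel"])
    for i in range(100_000)
]
msgspec_rows = [
    PackageStruct(name=f"pkg-{i}", version="1.0.0", distributions=["sdist", "wheel"])
    for i in range(100_000)
]

# Pydantic v2: serialize a list of models through a TypeAdapter.
pydantic_json = TypeAdapter(list[PackageModel]).dump_json(pydantic_rows)

# msgspec: encode the list of structs directly to bytes.
msgspec_json = msgspec.json.encode(msgspec_rows)

print(len(pydantic_json), len(msgspec_json))
```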

Because of this, I'm putting a switch to msgspec on the table. Maybe we can do a trial period where we test it out on staging and monitor memory usage to see how it performs.
