My server takes up a lot of CPU and RAM after a database restore #17138
Comments
Did you update along with the restore?
Nope. It was up to date before the restore.
But yeah, if there is a way to fix it quickly, I won't say no. The resources are so saturated that it's impossible to use the server (which is supposed to be online).
I updated Synapse (I'm now on 1.106). The problem isn't fixed.
@Cocam123 try rolling back to v1.104.0, or whatever version you were using before. For me, database performance issues started with v1.105.0.
How do I do that?
I installed Synapse with `apt install matrix-synapse-py3`
Okay, I found the package, but it seems like it doesn't work.
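(For context, a hedged sketch of how pinning an older matrix-synapse-py3 package with apt might look; the exact version string is an assumption and should be copied from the `apt list` output.)

```bash
# List the versions of the package available from packages.matrix.org
apt list -a matrix-synapse-py3

# Install a specific older version; the "1.104.0" string below is an
# assumption -- copy the real version string from the list above
sudo apt install matrix-synapse-py3=1.104.0

# Optionally hold the package so it is not upgraded again automatically
sudo apt-mark hold matrix-synapse-py3
```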
Did you possibly restore the backup multiple times, e.g. partially restore it once, run into an error or interrupt it manually, then restore it again without clearing the database that you (partially) restored to the first time? Edit: for context, I ask because this is known to cause problems, as it can lead to duplicate rows. matrix-org/synapse#11779 was a previous example, though it looks like that particular case got patched.
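(A hedged sketch of one way to check the restored database for unexpectedly large tables, which duplicated rows would tend to show up as; the database name `synapse` is an assumption.)

```bash
# Largest tables by on-disk size after the restore; substitute your
# actual database name for "synapse"
sudo -u postgres psql synapse -c "
  SELECT relname,
         n_live_tup,
         pg_size_pretty(pg_total_relation_size(relid)) AS total_size
  FROM pg_stat_user_tables
  ORDER BY pg_total_relation_size(relid) DESC
  LIMIT 20;"
```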
Another suggestion is to …
I completely reinstalled everything (after a reset). The server is working, but it's extremely slow. The CPU is overloaded and it eats up storage (when I restart Synapse, the storage is freed up).
If you definitely restored your database from fresh in one attempt, then that clears the first point at least :) Did you try to …? I know it's a faff, but using the Prometheus metrics + Grafana dashboard can help give a little more of an idea of where the time is going. As mentioned earlier, there is a suspected (or is it safe to say 'known'?) performance regression in this version (#17129), but if you were already running that version before, then I don't see why that would be the problem.
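(A minimal sketch of wiring Synapse up to Prometheus, assuming the standard metrics listener and the `/_synapse/metrics` scrape path; the port number and job name are arbitrary choices.)

```yaml
# homeserver.yaml -- enable the metrics listener (port 9400 is an arbitrary local port)
enable_metrics: true
listeners:
  # ... keep the existing client/federation listeners ...
  - port: 9400
    type: metrics
    bind_addresses: ['127.0.0.1']

# prometheus.yml -- scrape the Synapse metrics endpoint
scrape_configs:
  - job_name: synapse
    metrics_path: /_synapse/metrics
    static_configs:
      - targets: ['127.0.0.1:9400']
```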
Hey! Okay, I installed Prometheus and Synapse. What information might be of interest? I see federation, but I don't know what else to send (what else could be causing the problem).
I've also run a VACUUM ANALYZE on the entire database, but it didn't help.
It found dead rows, but despite the ANALYZE and a VACUUM FULL VERBOSE, it didn't improve the server's situation.
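(A hedged sketch of how to inspect dead-row counts and autovacuum activity per table; `synapse` as the database name is an assumption.)

```bash
# Dead-row counts and the last (auto)vacuum time per table
sudo -u postgres psql synapse -c "
  SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
  FROM pg_stat_user_tables
  ORDER BY n_dead_tup DESC
  LIMIT 20;"
```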
Thanks for your graphs! I notice that 'Age of oldest event in staging area' is high (2+ days), that you have ~500 events in the staging area, and that this number doesn't seem to be decreasing. This probably means that your server is struggling to persist events. Since the CPU usage doesn't look very high, I guess something is going slowly in the database? (Is Postgres the source of the high CPU use on your server?) If you open up the 'Database' section in the Grafana dashboard, that probably has some interesting info, if you don't mind sharing it.
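(A hedged sketch of checking what Postgres is actually spending its time on while events queue up in the staging area; `synapse` as the database name is an assumption.)

```bash
# Currently running statements, longest-running first
sudo -u postgres psql synapse -c "
  SELECT pid,
         now() - query_start AS runtime,
         state,
         wait_event_type,
         wait_event,
         left(query, 80) AS query
  FROM pg_stat_activity
  WHERE state <> 'idle'
  ORDER BY runtime DESC NULLS LAST;"
```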
Description
Since I restored a backup of my database, I've been unable to connect to my Synapse Matrix server. It simply takes up all the RAM and CPU available on the machine.
I've talked about it in the Matrix channel, but we haven't managed to solve the problem.
I tried REINDEX and VACUUM FULL. I also disabled presence and changed some federation parameters in homeserver.yaml:
federation:
  destination_min_retry_interval: 1m
  destination_retry_multiplier: 5
  destination_max_retry_interval: 365d
but after restarting, the same problems occur.
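(For reference, a minimal sketch of the presence setting mentioned above, assuming the current homeserver.yaml layout.)

```yaml
presence:
  enabled: false
```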
Steps to reproduce
Homeserver
matrix.cocamserverguild.com
Synapse Version
1.105.1
Installation Method
Debian packages from packages.matrix.org
Database
PostgreSQL (a single instance, restored from a backup)
Workers
Single process
Platform
It's running on a Debian machine on a VPS.
2 CPUs, 4 GB RAM
Configuration
No response
Relevant log output
Anything else that would be useful to know?
No response