Infinite loop of MediaSoup errors #2169
Just happened again (I found out because my Papertrail account suddenly started approaching its quota due to the log spam). It settled down again after just a regular restart.
Someone just reported that they were having trouble staying connected to audio calls, and when I checked the server logs it had entered this state of spitting out "already exists" errors. They said:
Grasping at straws a bit here, but I wonder if downgrading to mediasoup 3.11.2 would work around this. versatica/mediasoup#952, which was part of mediasoup 3.11.3, was pulled into JR in Nov 2022 in #1131. I'd thus have expected this issue to be observed by JR users in 2023/2024, but maybe our team is somehow triggering a scenario that hasn't been hit before. Looking at the above error, the root question is still why we're getting into this state where the channel already exists unexpectedly and isn't being reused properly, but perhaps downgrading would mitigate the impact of the bug in the near term, if that version isn't too old or incompatible.
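If we do try pinning mediasoup back to 3.11.2, one low-risk way to confirm which version the server actually ends up running is to log the library's exported version string at startup. This is only a sketch; where exactly this would live in JR's startup path is an assumption on my part:

```typescript
// Sketch: log the mediasoup library version at server startup so a downgrade
// to 3.11.2 can be verified from the server logs. The placement of this call
// within JR's startup code is hypothetical.
import * as mediasoup from 'mediasoup';

console.log(`mediasoup version in use: ${mediasoup.version}`);
```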
This is an attempt at working around deathandmayhem#2169. The bug manifests as loops of tens of thousands of ChannelMessageRegistrator::RegisterHandler failures, all for the same ID, because the handler already exists. In 3.11.2, the first failure would cause the existing handler to be unregistered, which would presumably make the second attempt succeed. In 3.11.3, the existing handler is left in place, which could certainly explain a loop of repeated errors, since the state isn't changing. It seems likely that there is still another latent bug causing handlers to be abandoned instead of being unregistered properly, but perhaps going back to the old behavior is a better compromise until this can be fixed properly. See deathandmayhem#2169
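To make that difference concrete, here is a minimal TypeScript model of the two registration behaviors. The real ChannelMessageRegistrator is C++ code inside the mediasoup worker, so the names and shapes below are illustrative assumptions rather than mediasoup's actual implementation:

```typescript
// Minimal TypeScript model of the behavior difference described in the commit
// message above. The real ChannelMessageRegistrator is C++ code inside the
// mediasoup worker; the names and shapes here are illustrative assumptions.
type Handler = (message: string) => void;

class HandlerRegistry {
  private readonly handlers = new Map<string, Handler>();

  // 3.11.2-style: a duplicate registration still fails, but it evicts the
  // stale handler, so the caller's next attempt for the same ID can succeed.
  registerEvictingStale(id: string, handler: Handler): void {
    if (this.handlers.has(id)) {
      this.handlers.delete(id);
      throw new Error(`handler with ID ${id} already exists (stale entry removed)`);
    }
    this.handlers.set(id, handler);
  }

  // 3.11.3-style: the existing handler is left untouched, so a caller that
  // keeps retrying the same registration hits the same error every time,
  // which matches the repeating "already exists" loop in the logs.
  registerKeepingExisting(id: string, handler: Handler): void {
    if (this.handlers.has(id)) {
      throw new Error(`handler with ID ${id} already exists`);
    }
    this.handlers.set(id, handler);
  }
}
```

Under the 3.11.2-style behavior, a retry loop stops after one failure per stale ID; under the 3.11.3-style behavior it never makes progress, which is consistent with tens of thousands of identical errors per ID.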
My JR instance has been up for ~3 weeks.
Starting on 6/4 (two days after #2167 happened, with nothing else in between in the server logs), the logs started showing a repeated sequence like this:
The IDs in the messages come and go, but each one appears tens of thousands of times.
Possibly unrelated, but this morning I got a flurry of MongoDB connection errors (MongoPoolClearedError, PoolClearedOnNetworkError, MongoServerSelectionError), and when I tried loading the site a little later it failed on the first try and then was quite slow to come up, though it has since recovered, AFAICT, aside from the log spam.
I pushed a new Docker image to trigger an update from HEAD (with whatever has landed in Git in the last couple of weeks) plus a restart, and that seems to have stopped the log spam for the moment.