Infinite loop of MediaSoup errors #2169

jpd236 · 2024-06-07T03:55:18Z

My JR instance has been up for ~3 weeks.

Starting on 6/4 (two days after #2167 happened, with nothing else in between in the server logs), the logs started showing a repeated sequence like this:

2024-06-04T02:40:07.127Z mediasoup:ERROR:Channel [pid:54 ChannelMessageRegistrator::RegisterHandler() | throwing MediaSoupError: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists
2024-06-04T02:40:07.170Z mediasoup:ERROR:Channel [pid:54 Worker::HandleRequest() | throwing MediaSoupError: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists [method:transport.produceData]
2024-06-04T02:40:07.178Z mediasoup:WARN:Channel request failed [method:transport.produceData, id:53]: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists [method:transport.produceData]
2024-06-04T02:40:51.254Z mediasoup:ERROR:Channel [pid:54 ChannelMessageRegistrator::RegisterHandler() | throwing MediaSoupError: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists
2024-06-04T02:40:51.254Z mediasoup:ERROR:Channel [pid:54 Worker::HandleRequest() | throwing MediaSoupError: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists [method:transport.produceData]
2024-06-04T02:40:51.255Z mediasoup:WARN:Channel request failed [method:transport.produceData, id:71]: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists [method:transport.produceData]
2024-06-04T02:41:20.346Z mediasoup:ERROR:Channel [pid:54 ChannelMessageRegistrator::RegisterHandler() | throwing MediaSoupError: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists
2024-06-04T02:41:20.348Z mediasoup:ERROR:Channel [pid:54 Worker::HandleRequest() | throwing MediaSoupError: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists [method:transport.produceData]
2024-06-04T02:41:20.349Z mediasoup:WARN:Channel request failed [method:transport.produceData, id:88]: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists [method:transport.produceData]
2024-06-04T02:41:48.808Z mediasoup:ERROR:Channel [pid:54 ChannelMessageRegistrator::RegisterHandler() | throwing MediaSoupError: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists
2024-06-04T02:41:48.809Z mediasoup:ERROR:Channel [pid:54 Worker::HandleRequest() | throwing MediaSoupError: Channel request handler with ID c93bdf42-c7e6-4f63-b288-f8d9546165d2 already exists [method:transport.produceData]

The IDs in the messages come and go, but each one appears tens of thousands of times.

Possibly unrelated, but this morning I got a flurry of MongoDB connection errors (MongoPoolClearedError, PoolClearedOnNetworkError, MongoServerSelectionError), and when I tried loading the site a little later it failed the first try and then was quite slow to come up, though it since recovered AFAICT aside from the log spam.

I pushed a new docker image to trigger an update from HEAD (with whatever has landed in Git in the last couple of weeks) + restart and that seems to have stopped the log spam for the moment.

The text was updated successfully, but these errors were encountered:

jpd236 · 2024-09-28T05:08:46Z

Just happened again (found out because my Papertrail account started approaching quota all of a sudden due to the log spam). Settled down again with just a regular restart.

jpd236 · 2024-12-05T02:54:16Z

Someone just reported that they were having trouble staying connected to audio calls, and I checked the server logs and it had entered this state of spitting out "already exists" errors. They said:

We were all getting hold messages, and then it claimed there were 2-3 versions of each of us in the call while still not having any audio

jpd236 · 2025-01-03T19:45:36Z

Grasping at straws a bit here, but I wonder if downgrading to mediasoup 3.11.2 would work around this. versatica/mediasoup#952, which was part of mediasoup 3.11.3, was pulled into JR in Nov 2022 in #1131. I'd have thus expected this issue to be observed by JR users in 2023/2024, but maybe somehow our team is triggering a unique scenario that hasn't been triggered before.

In the above error, ChannelMessageRegistrator::RegisterHandler is throwing because a channel already exists. Prior to that PR, if this situation occurs, registerHandler would catch the error and unregister the existing handler. While that seems like a good change in general to make, if there is a bug that causes JR to erroneously try to reuse a channel ID when it shouldn't, it could conceivably turn a one-off error that causes the existing channel to drop into an infinite loop where it repeatedly tries and fails.

So the root question is still why we're getting in this state where the channel already exists unexpectedly and isn't being reused properly, but perhaps downgrading will mitigate the impact of the bug in the near-term, if that version isn't too old/incompatible.

This is an attempt at working around deathandmayhem#2169. The bug indicates loops of tens of thousands of ChannelMessageRegistrator::RegisterHandler failures all for the same ID due to the handler already existing. In 3.11.2, the first failure would cause the existing handler to be unregistered, which would presumably make the second attempt succeed. In 3.11.3, it the existing handler would be left around, which could certainly explain a loop of repeated errors as the state isn't changing. It seems likely that there is still another latent bug causing handlers to be abandoned instead of being unregistered properly, but perhaps going back to the old behavior is a better compromise until this can be fixed properly. See deathandmayhem#2169

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Infinite loop of MediaSoup errors #2169

Infinite loop of MediaSoup errors #2169

jpd236 commented Jun 7, 2024

jpd236 commented Sep 28, 2024

jpd236 commented Dec 5, 2024

jpd236 commented Jan 3, 2025

Infinite loop of MediaSoup errors #2169

Infinite loop of MediaSoup errors #2169

Comments

jpd236 commented Jun 7, 2024

jpd236 commented Sep 28, 2024

jpd236 commented Dec 5, 2024

jpd236 commented Jan 3, 2025