Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR fixes #555 and provides a regression test that minimally recreates the deadlocking issue.
If you checkout
Rx/v2/test/CMakeLists.txt
andRx/v2/test/subscriptions/race_condition.cpp
into the currentmaster
branch and try to runrxcpp_test_race_condition
, you will most likely find that the test gets permanently deadlocked. The deadlocking result is not guaranteed, since it depends on a race condition that has a very narrow window, but it's very probable that the race condition will occur (at least on my machine, it has been). If it doesn't deadlock on the first try, then the test can be rerun repeatedly until it eventually does get deadlocked.But if you run the test on this branch, with the changes to
multicast_observer
, you will find that it never gets deadlocked no matter how many times the test is repeated. The deadlock is fixed by simply changing the mutex ofmulticast_observer::state_type
to astd::recursive_mutex
which guarantees that it is safe to recursively lock it (which is what causes the deadlock when using a plainstd::mutex
).I understand that some people consider
std::recursive_mutex
to be something that should always be avoided, because it may be indicative of a design flaw in the memory access strategy. But after putting a lot of thought into this specific deadlocking issue, I don't see an alternative fix for this without a dramatic redesign ofmulticast_observer
.While this deadlocking is a very rare occurrence, it may be an extremely impactful one. Whenever this occurs, it permanently locks up an entire thread. In my own use case, I have a design pattern where I assign an entire category of operations to be observed on a specific
rxcpp::schedulers::worker
(essentially using the worker as a bottleneck to ensure mutually exclusive access to a certain cluster of data), and if that worker ever gets caught in this deadlock, then I'll permanently lose the ability to perform that entire category of operations.So while it might be desirable to do a redesign to avoid the need for a
std::recursive_mutex
I would strongly encourage using it as a stopgap until an alternative solution can be found and implemented, assuming that might take some time to work out.