-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
some komodo_...() functions needs to be wrapped by cs_main lock #482
Comments
An option to consider: move the functionality of those methods to within the CChain class and do the locking there. Sometimes this can centralize the mutex to only be needed within the class, easing maintenance, and removing the need for the assertion. |
I thought about that, but checking the code I think it is not as simple as that: sometimes several different calls need to be wrapped by the same lock object, so it is only the caller who knows which code to lock. |
With little research, I can (and will) do a PR to add the necessary What I've found:
Synchronization is difficult, and often implemented wrong. There are ways to ease the burden, but such strategies require understanding and compliance by all who implement and use it. A quick search led me to this example. Granted, it seems to only be used by Verus, but it will illustrate what I believe to be an issue... Line 1212 in 56590a4
The method Moving the lock to the beginning of the method is one option. But doing this everywhere would probably lead to the application performing worse than if it were designed to be single-threaded. Another possible solution is for chainActive to provide a method to retrieve the requested CBlockIndex and use an internal mutex to protect any operation on the vector vChain (vChain is already private, so no external accessors). But that often requires two sets of methods. One to do a simple "get me the CBlockIndex" and another for "get me the CBlockIndex and don't let anyone else do anything with the vector until I'm done". Recursive mutexes could be used instead, but that comes with a performance hit. So the short answer is to fix these things when we find them, I understand that. The long answer is to rework the purpose of cs_main (i.e. make it responsible for less) before performance takes a hit, or programmers who don't know better (myself included) write new code that causes this GitHub issue to repeat ad infintum. |
I made a quick template to brute-force check for other non-lock situations at runtime using AssertLockHeld. You can see it at 7d7eabb Just swap out the CChain in main.cpp and main.h as was done in that commit. Running komodod gives me
And for those curious, here is the backtrace:
|
I agree with the most of above and about get_chainactive in specific.
I will add your fix for get_chainactive as well. |
I added the protections in commit jmjatlanta@186661c There are a few more TODO items for PR #485 which I hope to resolve in short order. I agree there are surely many other areas where locks are missed. We'll create issues/PRs when we find them. |
probably we could think of using the shared lock on reading and exclusive lock on writing. Wondering why this was not done in the original code |
If we could move to C++17 (preferred IMO, just not sure clients would agree), read/write locks are part of the standard. If not, such things have been available in boost since 1.35. As I am sure you are aware, such mechanisms carry overhead. But for lock contention issues (which can often be the case for frequent reads and occasional writes), there is sometimes a performance gain. I would imagine we do a lot more reading than writing to things like the block index, so I'd say your idea is a good one at least in some areas. With so many references to
|
afaik, locks have only ever historically been used "sparingly" and added as necessity dictated rather than as a matter of style and practice |
yes, however komodo introduced a NSPV protocol which implies many client connections to nspv nodes doing multiple read requests (in comparison to ordinary nodes) so we should think of adjusting the locking strategy |
While I do think it's something to look at, my main point was to answer the "question" of "why". |
many missing locks were added in #559 |
Several functions komodo_chainactive komodo_heightstamp komodo_chainactive_timestamp access the chainActive object that needs to be protected by cs_main critical section.
It looks like calls to these functions are not surrounded with LOC(cs_main) when required. At least the nspv server part implementation may call those functions without cs_main lock.
That needs to be added.
Also it is suggested to add AssertLockHeld(cs_main) into those functions to track locks (this is activated by macro DEBUG_LOCKORDER)
The text was updated successfully, but these errors were encountered: