-
-
Notifications
You must be signed in to change notification settings - Fork 507
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PoC: config write guard #2360
base: master
Are you sure you want to change the base?
PoC: config write guard #2360
Conversation
@schlimmchen I have not looked at the changes in detail yet, but it's definitely an improvement. There are more synchronization issue between the WebApi and the main loop though, like here (if I'm not mistaken): OpenDTU/src/WebApi_inverter.cpp Line 144 in dc5eb96
|
Looks great! I am just curious if we would ever have to write the config in the context of the main loop (as it will generate a dead lock). Currently I cannot imagine any use case. |
Thanks for your feedback!
Yes. I saw that, too. If one would inadvertently try to do this, the deadlock would happend when testing the change for the first time, and the config would not be saved at all, so it should be noticed early. And if we need to do that, we can adjust the WriteGuard c'tor to see whether the current task it the main loop task and then only lock the mutex, not waiting for the signal. Let me polish and double-check this, then I will un-draft this PR (next week). |
@tbnobody are you passionate about
versus
or whether it's consistent across the code base? |
the configuration write guard is now required when the configuration struct shall be mutated. the write guards locks multiple writers against each other and also, more importantly, makes the writes synchronous to the main loop. all code running in the main loop can now be sure that (1) reads from the configuration struct are non-preemtive and (2) the configuration struct as a whole is in a consistent state when reading from it. NOTE that acquiring a write guard from within the main loop's task will immediately cause a deadlock and the watchdog will trigger a reset. if writing from inside the main loop should ever become necessary, the write guard must be updated to only lock the mutex but not wait for a signal.
9f4f4fa
to
b1edb13
Compare
I am still happy with this. I double checked that generating a DTU serial works as expected using an erased ESP32:
|
bool read(); | ||
bool write(); | ||
void migrate(); | ||
CONFIG_T& get(); | ||
CONFIG_T const& get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this getter could become problematic: how it ensures that reads going through it will be protected by the mutex ?
What if another task / core modifies the config while another one calls this method to access a value ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, there is no assurance. I did not want to create overhead when reading the config structure. we assume the config structure is only read in context of the main loop. since writing is synchronized with the main loop using the changes in this PR, there is no issue when reading the structure in context of the main loop. however, other tasks also do read the structure, and indeed, those reads are still not protected through the changes in this PR. IMHO, that's a next step. it's going to be a pain to address this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No that's not complicated: the config class just has to expose the mutex (or 2 methods) and where you read the config in the code, you lock the mutex before then unlock it.
The ESP-IDF and Arduino Core are full of designs like that, especially in the LWIP layer which now require to lock/unlock all operations since Matter was introduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not write it was complicated, just a pain 😉 What I meant was: large changeset. If we want to use a lock guard as well, we also need an additional scope, meaning indenting code, meaning even larger changeset. And things do indeed get complicated, or at least nasty, if the mutex must be released before a function is called, as to avoid recursive locking. I assume recursive locks are to be avoided, especially since we also use other mutexes and deadlocks are looming once we need to juggle multiple locks simultaneously.
The ESP-IDF and Arduino Core are full of designs like that
You are suggesting that we should not be afraid of the overhead of locking the same mutex dozens of times in the main loop in every round of the loop? Usually, I am very liberal when it comes to trading some performance for new and useful features, but locking mutexes? Am I paranoid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree!
All depends on who reads, when and at which frequency.
I don't know enough the app internally.
Maybe otherwise returning copies or partial copies of the config when reading would be an option also
I was just highlighting the fact that there is no protection theorically.
But if in practice it never happens then it's good.
I was trying to tackle application of changed Dynamic Power Limiter settings in my feature branch implementing support for multiple inverters managed by the DPL. Then I noticed, that a bunch of nasty problem would go away if I knew that I could trust the config being consistent, i.e., if the config was only written synchronously with the main loop.
The latter ist currently not true: The config is usually manipulated in the context of the async webserver task, while it is read in the context of the main loop.
I wanted to keep the overhead when reading the config to a minimum, so locking a mutex each time the config shall be read is out of the picture, IMHO.
Writing the config happens rather seldom. So it should be okay if that takes a couple of milliseconds longer -- in the worst case.
My approach: Writing the config is only possible when acquiring a WriteGuard, which locks the "config write mutex" and waits for the signal (from the main loop) to start writing. The signal is needed to make writing the config synchronous with the main loop, and the mutex is needed to use the condition_variable but also to lock multiple writers from each other (which is less of a concern, to be honest).
I like this approach, because I like RAII. However, I am not sure if this is the best approach. Another idea would be to enqueue a callback which is then executed within the context of the main loop. We would need to copy and bind to that callback a significant amount of data, which seems off-putting. Also, access to the queue of callbacks then also need to be locked by a mutex.
@tbnobody I would like to know your opinion on this. Do you like this approach?
@LennartF22 We talked about this briefly. What are your thoughts on this approach?