[BUG] Data race if the provider is set multiple times #307
Comments
Hey @erka, could you please try the test provider and see if that resolves your issue?
@beeme1mr I am trying to implement the provider and write some e2e tests for it. Using another provider doesn't help much. The issue is related to providers which implement `StateHandler`.
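For context, the lifecycle contract in question looks roughly like this in recent go-sdk releases (paraphrased; check provider.go in your SDK version for the exact shape):

```go
// A provider opts into lifecycle handling by implementing this interface
// alongside FeatureProvider. Paraphrased from the go-sdk source; some
// releases carry additional methods on it.
type StateHandler interface {
	Init(evaluationContext EvaluationContext) error
	Shutdown()
}
```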
In the next release, state management moves to the provider. This is a non-breaking change but may address this problem. FYI @toddbaert
There have been substantial changes to the SDK recently, as @beeme1mr said, which are as yet unreleased. The changes are significant enough that there's a chance this bug could be resolved. I would have liked to see them released already, but there are a few more issues we have to iron out, namely the one you pointed out here: #296 (review)
Just keeping this alive: release 1.14.0 doesn't fix the issue. go-sdk uses
Thanks for the update @erka! If you have some ideas for a fix, I'm open to that. Otherwise, we have some pressing issues internally, so I probably won't be able to dig into this until next week. 🙏
Hi @toddbaert, we're having the same issue with our project. I think that just running

cc @rdimitrov
@blkt I gave it a quick shot and it seems to fix the problem. At least all tests pass, even with the race detector enabled. Cool.
@erka yeah, using
@warber We could add a
I'll give this a shot this week unless I see a PR from somebody else first. |
My findings so far:

```go
// The outgoing provider is shut down asynchronously, holding a read lock
// on the API mutex for the duration of the shutdown:
go func(forShutdown StateHandler) {
	api.mu.RLock()
	defer api.mu.RUnlock()
	forShutdown.Shutdown()
}(v)
```
I'm going to give this a shot. Somebody stop me if you think I'm barking up the wrong tree. |
I didn't get to this, but @warber, I've assigned it to you as well. If you don't have time, please let me know and I'll pick it up.
OK, @warber pointed out to me that this plan has other problems, namely that if the same instance is set twice, it's still not possible to compare them. 😩

Zooming out a bit, the real pain here is that we can't compare provider references reliably. This could be resolved another way, I think, by modifying our functions to require the provider to be a pointer (and never a struct). I think in most real-world cases this is what we'll see anyway, but sometimes people (especially in testing) use structs. This is why we had to start doing

If we simply require all args to
@erka actually proposed a similar idea here. So, by storing the kind as a field and comparing it to determine whether we're dealing with a pointer or not, we ensure it works reliably only with pointers or with structs that support deep comparison. Ultimately, I'm unsure which approach is better: enforcing the use of pointers as input parameters at runtime, or keeping it flexible and documenting the limitation. I'm slightly leaning towards the latter. 🤔
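A sketch of what the kind-based comparison could look like; `providerRef` and its methods are illustrative names, not the SDK's internals:

```go
package main

import (
	"fmt"
	"reflect"
)

// providerRef stores a provider along with whether it was registered as a
// pointer, recorded once so later comparisons know which strategy is safe.
type providerRef struct {
	provider any
	isPtr    bool
}

func newProviderRef(p any) providerRef {
	return providerRef{provider: p, isPtr: reflect.ValueOf(p).Kind() == reflect.Ptr}
}

// same reports whether both refs hold the same provider: pointer identity
// when both are pointers, deep equality as a best effort otherwise.
func (r providerRef) same(o providerRef) bool {
	if r.isPtr && o.isPtr {
		return r.provider == o.provider // interface comparison: pointer identity
	}
	return reflect.DeepEqual(r.provider, o.provider)
}

func main() {
	type fake struct{ name string }
	a, b := &fake{"a"}, &fake{"a"}
	fmt.Println(newProviderRef(a).same(newProviderRef(a))) // true: same pointer
	fmt.Println(newProviderRef(a).same(newProviderRef(b))) // false: distinct instances
}
```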
From our perspective, this is an issue during tests only. In the application itself, we use

My understanding of @erka's approach is that it would break provider implementors and protect normal users from the change. If that's the case, I agree it might be preferable to just switching to a pointer.

That said, my unsolicited opinion is that if provider shutdown is an important step, it should be synchronous regardless of its cost. I don't know whether it's an expensive operation, but I think any optimization should happen inside the provider's shutdown procedure rather than outside it.
The issue is that provider shutdown can happen when a provider is no longer in use (for instance, when a new one has been set) and can (as the spec says) involve flushing data such as telemetry, which is often asynchronous. Waiting for a provider that's no longer in use to clean up whatever it needs to because a new one has been set seems like a problem to me: as a user, I don't really care if the old provider is taking a while to shut down; I just want to use my new one and consider any clean-up of the old one a low-priority side effect.

This is a bit different than when the shutdown is triggered at the API level, which blocks. The use case in that situation is more like "my app is shutting down, please don't stop the process until my provider has shut down gracefully".
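For tests specifically, one workaround consistent with the discussion above is to have the test provider signal when its asynchronous shutdown has actually run, and block on that signal before the test exits. A minimal sketch; `waitProvider` and its `done` channel are illustrative, not SDK API:

```go
package repro

import (
	"testing"
	"time"

	"github.com/open-feature/go-sdk/openfeature"
)

// waitProvider embeds NoopProvider for the evaluation methods and closes a
// channel when the SDK's background shutdown of this provider finally runs.
type waitProvider struct {
	openfeature.NoopProvider
	done chan struct{}
}

func (p *waitProvider) Init(openfeature.EvaluationContext) error { return nil }
func (p *waitProvider) Shutdown()                                { close(p.done) }

func TestWaitsForOldProviderShutdown(t *testing.T) {
	p := &waitProvider{done: make(chan struct{})}
	if err := openfeature.SetProvider(p); err != nil {
		t.Fatal(err)
	}
	// Replacing the provider triggers the asynchronous shutdown of p.
	if err := openfeature.SetProvider(openfeature.NoopProvider{}); err != nil {
		t.Fatal(err)
	}
	select {
	case <-p.done: // the replaced provider finished shutting down
	case <-time.After(5 * time.Second):
		t.Fatal("old provider was never shut down")
	}
}
```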
Observed behavior
I have several tests with a similar setup: the provider is created, then configured, the tests are executed, and `openfeature.Shutdown()` is called for cleanup. However, when these tests are run with the race detector enabled, they consistently fail.

Expected Behavior
No response
Steps to reproduce
Create a file with a test and run `go test -race ./...`
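A minimal sketch of such a test, assuming go-sdk v1.11+ import paths and a provider that opts into the `StateHandler` lifecycle; `raceProvider` and `TestProviderSwap` are illustrative names, not from the original report:

```go
package repro

import (
	"testing"

	"github.com/open-feature/go-sdk/openfeature"
)

// raceProvider embeds NoopProvider for the evaluation methods and opts into
// the StateHandler lifecycle, which is what triggers the background shutdown.
type raceProvider struct {
	openfeature.NoopProvider
}

func (p *raceProvider) Init(openfeature.EvaluationContext) error { return nil }
func (p *raceProvider) Shutdown()                                {}

func TestProviderSwap(t *testing.T) {
	// Setting a lifecycle-aware provider more than once makes the SDK shut
	// the previous one down in a goroutine...
	if err := openfeature.SetProvider(&raceProvider{}); err != nil {
		t.Fatal(err)
	}
	if err := openfeature.SetProvider(&raceProvider{}); err != nil {
		t.Fatal(err)
	}
	// ...and the race detector flags that goroutine against this teardown.
	openfeature.Shutdown()
}
```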