Eventually a removed flow is recreated when Kytos is under high usage (possible race condition) #128
Hi folks, Arturo (@ArturoQuintana) and I were testing Kytos/flow_manager and we found another way to demonstrate the race condition: create 100 flows with parallel requests. How to reproduce:
Expected result: 100 flows should be created. Actual result: some of the flows are lost; some are removed by the consistency check, and just a few remain in the flow_manager storehouse. Example:
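The parallel-creation step could be scripted roughly as follows. This is a minimal sketch: the endpoint URL, switch DPID, and flow payload fields are assumptions and must be adapted to your topology.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib import request

# Hypothetical flow_manager endpoint; adjust host and DPID to your setup.
URL = ("http://127.0.0.1:8181/api/kytos/flow_manager/v2/flows/"
       "00:00:00:00:00:00:00:01")

def make_payload(i):
    """Build one flow with a unique cookie and VLAN match (illustrative fields)."""
    return {"flows": [{"cookie": 0xAA00000000000000 + i,
                       "match": {"dl_vlan": 100 + i},
                       "actions": [{"action_type": "output", "port": 1}]}]}

def post_flow(i):
    """POST a single flow; returns the HTTP status code."""
    data = json.dumps(make_payload(i)).encode()
    req = request.Request(URL, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return resp.status

def create_flows_in_parallel(n=100):
    """Fire n requests concurrently to stress the shared stored_flows state."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(post_flow, range(n)))

# Against a live controller:
# create_flows_in_parallel()
```

After the burst, comparing the flows on the switch against the flow_manager storehouse shows the mismatch described above.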
Hi folks,
The consistency check routine eventually recreates a flow when Kytos is under high usage (many requests in a short period of time), which indicates a possible race condition.
More specifically, the following end-to-end tests sometimes pass and sometimes fail:
As noted, the test basically creates 10 EVCs, waits a couple of seconds, makes sure the flows were created, removes the 10 EVCs, waits a couple of seconds, makes sure the flows were removed, and so on. We repeat this procedure 10 times.
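The loop above can be sketched as a skeleton where the four helper callables stand in for the real mef_eline/flow_manager API calls (they are placeholders, not actual Kytos functions):

```python
import time

def run_cycle(create_evc, flows_created, remove_evc, flows_removed,
              n_evcs=10, repeats=10, settle=2):
    """Skeleton of the e2e test described above.

    create_evc, flows_created, remove_evc and flows_removed are
    placeholders for the real REST calls and flow checks.
    """
    for _ in range(repeats):
        evcs = [create_evc(i) for i in range(n_evcs)]
        time.sleep(settle)                 # let flows be pushed
        assert flows_created(evcs), "some flows missing after creation"
        for evc in evcs:
            remove_evc(evc)
        time.sleep(settle)                 # let flows be removed
        assert flows_removed(evcs), "stale flows left after removal"
```

With the intervals below, each settle window overlaps one or two consistency-check runs, which is what exposes the intermittent failures.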
This test was run using the AmLight Kytos image (https://hub.docker.com/r/amlight/kytos), which is basically a master-branch daily build image with some PRs applied (e.g., #117 and more: https://github.com/amlight/kytos-docker/tree/master/patches). Also, we change the default STATS_INTERVAL setting to 3s and leave CONSISTENCY_INTERVAL=0 (which means the consistency check will run every 3s). The LINK_UP_TIMER is also changed to 1s.
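For reference, the settings used were the following (the file locations are assumptions based on which NApp defines each setting):

```python
# of_core settings (assumed location): stats polling period in seconds
STATS_INTERVAL = 3

# flow_manager settings (assumed location): 0 means the consistency
# check runs at every stats interval, i.e., every 3s here
CONSISTENCY_INTERVAL = 0

# mef_eline settings (assumed location)
LINK_UP_TIMER = 1
```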
This problem seems to be a race condition: while the consistency check is running, requests to remove EVCs are received for different switches, and each thread changes self.stored_flows (the shared resource) in an uncontrolled manner.
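The lost-update pattern can be illustrated with a self-contained sketch. The names stored_flows and store_changed_flows only mimic flow_manager's attributes, and the sleep merely widens the race window:

```python
import threading
import time

stored_flows = {}            # stands in for FlowManager.stored_flows
lock = threading.Lock()

def store_changed_flows(dpid, flows, use_lock):
    """Read-modify-write of the shared dict, as each thread effectively does."""
    def critical():
        snapshot = dict(stored_flows)      # 1. read the shared state
        time.sleep(0.05)                   # 2. another thread runs meanwhile
        snapshot[dpid] = flows             # 3. modify the private copy
        stored_flows.clear()
        stored_flows.update(snapshot)      # 4. write back, clobbering step 3
                                           #    of any concurrent thread
    if use_lock:
        with lock:
            critical()
    else:
        critical()

def run(use_lock):
    """Run two concurrent updates; return how many survived."""
    stored_flows.clear()
    threads = [threading.Thread(target=store_changed_flows,
                                args=(f"sw{i}", [f"flow{i}"], use_lock))
               for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(stored_flows)

if __name__ == "__main__":
    # Without the lock, one of the two updates is usually lost (result 1);
    # with the lock, both always survive (result 2).
    print("no lock:", run(False), "with lock:", run(True))
```

Serializing every read-modify-write of self.stored_flows behind a lock (a single lock, or one per switch) would make the write-back in step 4 always see the other thread's update.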
I've enabled some logging, and this is how it looks (logs for the above test; cookie 0xaa28bc57b4ba8f4a == 12261257070396477258, i.e., EVC ID 28bc57b4ba8f4a). In between those events, I could see that the function _store_changed_flows() sometimes overwrites content that was previously changed. The full log is attached:
syslog-2021-05-10.log.gz