Delay machine upgrade until all members are up-to-date #493
Merged
+439
−97
dumbbell force-pushed the delay-machine-upgrade branch 4 times, most recently from 2a179b8 to 267aa04 (December 26, 2024 10:39)
dumbbell force-pushed the delay-machine-upgrade branch from 267aa04 to 602ffda (December 26, 2024 17:23)
kjnilsson requested changes on Jan 2, 2025
Looks good. A few tweaks, and a question as to whether we should optionally keep the old behaviour or not.
dumbbell force-pushed the delay-machine-upgrade branch from 602ffda to 782d1ab (January 6, 2025 10:32)
dumbbell force-pushed the delay-machine-upgrade branch 2 times, most recently from 1057a7d to 6b69268 (January 7, 2025 11:14)
kjnilsson reviewed on Jan 9, 2025
[Why]
Before this patch, a Ra cluster would switch to a new machine version immediately after a leader with that version was elected.

Because a leader can be elected with votes from only a quorum of members, the cluster could start using the new machine version as soon as a quorum of members supported that version.

Unfortunately, other members that do not support it stop applying commands because they run an older version of the machine code. For some consumers of Ra, like Khepri, this means they could cease their operation locally until the member is restarted with the new machine version.

We want to delay the machine upgrade to a point where all members know about the new version. This ensures all members can continue to provide their service.

[How]
The machine version to use is communicated by the leader using the `noop` command. This command is the first one sent just after an election. Before this patch, the machine version passed was the leader's local machine version.

With this patch, the `noop` command sent after an election passes the effective machine version, except if the leader is alone and unclustered, in which case it passes the latest machine version. Therefore, in a cluster, the leader will send a second `noop` command with a newer machine version later, once all members support it.

To determine what each follower supports, this patch introduces two commands:
* `#info_rpc{}`
* `#info_reply{}`

Once a leader is elected, in addition to the `noop` command, it sends an `#info_rpc{}` command to all followers. They reply with `#info_reply{}`, which carries the machine version they support. This mechanism is not specific to machine upgrades: it could be extended in the future to communicate more details about each follower.

Once the leader has received the machine version of every follower, it can determine the highest machine version that all members support. For that, it simply takes the lowest reported machine version (including the leader's own machine version). If this version is greater than the effective machine version, the leader sends a new `noop` command with the new machine version to use.

At each "tick", the leader sends the `#info_rpc{}` command again to any follower that has not reported anything yet, or whose reported machine version is lower than the leader's own supported machine version. This takes care of followers that did not receive the initial `#info_rpc{}` and of those that were restarted as part of an upgrade.

Fixes #490.

V2: Address comments from @kjnilsson:
* Use an empty map by default in `#info_reply{}` instead of `undefined`. This simplifies the handling of the reply with a single `lists:foldl/3` instead of two.
* Merge `has_enough_peer_info/1` into `get_max_supported_machine_version/1`.
* Add a system-level option to restore the Ra 2.15 behavior.
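The version selection described in this commit message can be illustrated with a minimal sketch. This is not the code from the patch: the module name, the function names, the `{PeerId, Info}` list shape and the `machine_version` map key are all hypothetical, chosen only to show the "lowest reported version, and only once everyone has reported" rule with a single `lists:foldl/3`. It assumes an empty map stands for a follower that has not replied yet.

```erlang
-module(machine_version_sketch).
-export([max_supported_version/2, upgrade_target/3]).

%% Hypothetical helper, not Ra's actual implementation.
%% LeaderVersion: machine version supported by the leader itself.
%% Peers: list of {PeerId, Info} where Info is the map reported by a
%% follower; an empty map means "no report received yet" (assumption).
%% Returns {ok, Version} with the lowest version across all members,
%% or `unknown` if at least one follower has not reported.
max_supported_version(LeaderVersion, Peers) ->
    lists:foldl(
      fun({_PeerId, #{machine_version := V}}, {ok, Min}) ->
              {ok, min(V, Min)};
         (_PeerWithoutReport, _Acc) ->
              %% One missing report is enough to make the result unknown.
              unknown
      end, {ok, LeaderVersion}, Peers).

%% Decides whether a second `noop` with a newer version should be sent.
%% EffectiveVersion: machine version currently in use by the cluster.
upgrade_target(EffectiveVersion, LeaderVersion, Peers) ->
    case max_supported_version(LeaderVersion, Peers) of
        {ok, V} when V > EffectiveVersion -> {upgrade, V};
        _ -> no_upgrade
    end.
```

For example, with a leader at version 3 and followers reporting 2 and 3, `upgrade_target(1, 3, Peers)` yields `{upgrade, 2}`; if any follower still has an empty map, it yields `no_upgrade`.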
dumbbell force-pushed the delay-machine-upgrade branch from 6b69268 to e9c82a0 (January 9, 2025 10:14)
kjnilsson approved these changes on Jan 9, 2025
Why
Before this patch, a Ra cluster would switch to a new machine version immediately after a leader with that version was elected.
Because a leader can be elected with votes from only a quorum of members, the cluster could start using the new machine version as soon as a quorum of members supported that version.
Unfortunately, other members that do not support it stop applying commands because they run an older version of the machine code. For some consumers of Ra, like Khepri, this means they could cease their operation locally until the member is restarted with the new machine version.
We want to delay the machine upgrade to a point where all members know about the new version. This ensures all members can continue to provide their service.
How
The machine version to use is communicated by the leader using the `noop` command. This command is the first one sent just after an election. Before this patch, the machine version passed was the leader's local machine version.
With this patch, the `noop` command sent after an election passes the effective machine version, except if the leader is alone and unclustered, in which case it passes the latest machine version. Therefore, in a cluster, the leader will send a second `noop` command with a newer machine version later, once all members support it.
To determine what each follower supports, this patch introduces two commands:
* `#info_rpc{}`
* `#info_reply{}`
Once a leader is elected, in addition to the `noop` command, it sends an `#info_rpc{}` command to all followers. They reply with `#info_reply{}`, which carries the machine version they support. This mechanism is not specific to machine upgrades: it could be extended in the future to communicate more details about each follower.
Once the leader has received the machine version of every follower, it can determine the highest machine version that all members support. For that, it simply takes the lowest reported machine version (including the leader's own machine version). If this version is greater than the effective machine version, the leader sends a new `noop` command with the new machine version to use.
At each "tick", the leader sends the `#info_rpc{}` command again to any follower that has not reported anything yet, or whose reported machine version is lower than the leader's own supported machine version. This takes care of followers that did not receive the initial `#info_rpc{}` and of those that were restarted as part of an upgrade.
Fixes #490.
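The per-tick resend rule in the description can be sketched as a small filter. This is only an illustration under stated assumptions, not the code merged in this pull request: the module name, `peers_to_query/2`, the `{PeerId, Info}` list shape and the `machine_version` key are hypothetical, and an empty map is assumed to mean that a follower has not replied yet.

```erlang
-module(info_rpc_tick_sketch).
-export([peers_to_query/2]).

%% Hypothetical helper, not Ra's actual implementation.
%% Returns the ids of the followers that should receive another
%% #info_rpc{} on this tick, preserving the order of Peers.
peers_to_query(LeaderVersion, Peers) ->
    [PeerId || {PeerId, Info} <- Peers, needs_query(LeaderVersion, Info)].

%% A follower is queried again if it reported a version lower than the
%% leader's own supported version (it may be upgraded and restarted
%% later), or if it has not reported anything yet.
needs_query(LeaderVersion, #{machine_version := V}) ->
    V < LeaderVersion;
needs_query(_LeaderVersion, _NoReportYet) ->
    true.
```

With a leader at version 3, a follower `a` reporting 3, a follower `b` reporting 2 and a follower `c` that has not replied, `peers_to_query(3, Peers)` returns `[b, c]`. Once every follower reports the leader's version, the list is empty and no further `#info_rpc{}` needs to be sent.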