Delay machine upgrade until all members are up-to-date #493

dumbbell · 2024-12-24T12:21:07Z

Why

Before this patch, a Ra cluster would switch to a new machine version immediately after a leader with that version was elected.

Because a leader can be elected as soon as a quorum of members votes for it, the cluster could start using the new machine version as soon as a quorum of members supported that version.

Unfortunately, the remaining members that do not support it stop applying commands because they run an older version of the machine code. For some consumers of Ra, like Khepri, this means a member could stop operating locally until it is restarted with the new machine version.

We want to delay the machine upgrade to a point where all members know about the new version. This ensures all members can continue to provide their service.

How

The machine version to use is communicated by the leader using the noop command. This command is the first one sent just after an election. Before this patch, the machine version passed was the leader's local machine version.

With this patch, the noop command sent after an election passes the effective machine version, except if the leader is the only member of its cluster (in which case it passes the latest machine version). Therefore, in a cluster, the leader will send a second noop command with a newer machine version later, once all members support it.
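The choice of version for the first noop command can be sketched as follows. This is a minimal illustration, not the code from this patch: the module name, the function `noop_version/3` and the shape of the `Peers` map are hypothetical.

```erlang
-module(noop_version_sketch).
-export([noop_version/3]).

%% Effective: machine version currently in use by the cluster.
%% Latest:    newest machine version the leader's code supports.
%% Peers:     map of the other members (empty when the leader is alone).
-spec noop_version(non_neg_integer(), non_neg_integer(), map()) ->
    non_neg_integer().
noop_version(_Effective, Latest, Peers) when map_size(Peers) =:= 0 ->
    %% No other member to wait for: upgrade right away.
    Latest;
noop_version(Effective, _Latest, _Peers) ->
    %% In a cluster: keep the effective version for now.
    Effective.
```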

To determine what each follower supports, this patch introduces two commands:

#info_rpc{}
#info_reply{}

Once a leader is elected, in addition to the noop command, it sends an #info_rpc{} command to all followers. They reply with #info_reply{} with the machine version they support. This mechanism is not specific to machine upgrades: this could be extended in the future to communicate more details about each follower.

Once the leader has received the machine version of every follower, it can determine the highest machine version supported by all members. For that, it simply takes the lowest reported machine version (including the leader's own machine version). If this version is greater than the effective machine version, the leader sends a new noop command with the new machine version to use.

At each "tick", the leader sends the #info_rpc{} command again to the followers that have not reported anything yet, or whose reported machine version is lower than the leader's own supported machine version. This takes care of followers that did not receive the initial #info_rpc{} and of those that were restarted as part of an upgrade.
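The computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this patch: the module, the function names and the representation of follower reports (a map from follower id to a version, or `undefined` when there is no reply yet) are hypothetical.

```erlang
-module(machine_version_sketch).
-export([max_supported_version/2, maybe_upgrade/3]).

%% Returns {ok, Version} with the lowest version among the leader and all
%% followers, or 'incomplete' if at least one follower has not replied yet.
-spec max_supported_version(non_neg_integer(),
                            #{term() => non_neg_integer() | undefined}) ->
    {ok, non_neg_integer()} | incomplete.
max_supported_version(LeaderVersion, PeerVersions) ->
    maps:fold(
      fun(_Peer, _Version, incomplete) -> incomplete;
         (_Peer, undefined, _Acc)      -> incomplete;
         (_Peer, Version, {ok, Min})   -> {ok, erlang:min(Version, Min)}
      end, {ok, LeaderVersion}, PeerVersions).

%% Decides whether a new noop command is needed.
-spec maybe_upgrade(non_neg_integer(), non_neg_integer(), map()) ->
    {send_noop, non_neg_integer()} | no_change.
maybe_upgrade(Effective, LeaderVersion, PeerVersions) ->
    case max_supported_version(LeaderVersion, PeerVersions) of
        {ok, Version} when Version > Effective -> {send_noop, Version};
        _ -> no_change
    end.
```

For example, with a leader at version 3 and followers reporting 2 and 3, the cluster can move to version 2 if the effective version is still 1.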
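The selection of followers to query again on a tick can be sketched as follows. As before, this is only an illustration: the module, `peers_to_query/2` and the map of reported versions (with `undefined` meaning "no reply yet") are hypothetical and do not reflect the actual Ra data structures.

```erlang
-module(info_rpc_tick_sketch).
-export([peers_to_query/2]).

%% Returns the sorted list of followers that should receive #info_rpc{}
%% again: those that never replied, and those that reported a machine
%% version lower than the one the leader supports.
-spec peers_to_query(non_neg_integer(),
                     #{term() => non_neg_integer() | undefined}) -> [term()].
peers_to_query(LeaderVersion, PeerVersions) ->
    lists:sort([Peer || {Peer, Version} <- maps:to_list(PeerVersions),
                        Version =:= undefined orelse Version < LeaderVersion]).
```

Followers that already report the leader's version are left alone, so the periodic traffic stops once every member has been upgraded.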

Fixes #490.

kjnilsson

looks good, a few tweaks and a question as to whether we should optionally keep the old behaviour or not.

src/ra.hrl

src/ra_server.erl

@kjnilsson

V2: Address comments from @kjnilsson:

* Use an empty map by default in `#info_reply{}` instead of `undefined`. This simplifies the handling of the reply with a single `lists:foldl/3` instead of two.
* Merge `has_enough_peer_info/1` into `get_max_supported_machine_version/1`.
* Add a system-level option to restore the Ra 2.15 behavior.

dumbbell requested a review from kjnilsson December 24, 2024 12:21

dumbbell self-assigned this Dec 24, 2024

dumbbell force-pushed the delay-machine-upgrade branch 4 times, most recently from 2a179b8 to 267aa04 Compare December 26, 2024 10:39

dumbbell added this to the 2.16.0 milestone Dec 26, 2024

dumbbell force-pushed the delay-machine-upgrade branch from 267aa04 to 602ffda Compare December 26, 2024 17:23

dumbbell marked this pull request as ready for review December 26, 2024 17:40

kjnilsson requested changes Jan 2, 2025

src/ra.hrl (outdated, resolved)

src/ra_server.erl (outdated, resolved)

src/ra_server.erl (resolved)

src/ra_server.erl (outdated, resolved)

src/ra_server.erl (outdated, resolved)

dumbbell force-pushed the delay-machine-upgrade branch from 602ffda to 782d1ab Compare January 6, 2025 10:32

dumbbell marked this pull request as draft January 6, 2025 10:33

dumbbell requested a review from kjnilsson January 6, 2025 10:34

dumbbell force-pushed the delay-machine-upgrade branch 2 times, most recently from 1057a7d to 6b69268 Compare January 7, 2025 11:14

kjnilsson reviewed Jan 9, 2025

src/ra_server.erl (outdated, resolved)

dumbbell force-pushed the delay-machine-upgrade branch from 6b69268 to e9c82a0 Compare January 9, 2025 10:14

dumbbell marked this pull request as ready for review January 9, 2025 10:14

dumbbell requested a review from kjnilsson January 9, 2025 10:14

kjnilsson approved these changes Jan 9, 2025

kjnilsson merged commit 7a6b85d into main Jan 9, 2025
6 checks passed

dumbbell deleted the delay-machine-upgrade branch January 9, 2025 10:55
