Delay machine upgrade until all Ra servers support it
[Why]
Before this patch, a Ra cluster would switch to a new machine version
immediately after a leader with that version was elected. Because a
leader can be elected with votes from only a quorum of members, the
cluster could start using the new machine version as soon as a quorum
of members supported that version.

Unfortunately, members that do not support the new version stop
applying commands because they run an older version of the machine
code. For some consumers of Ra, like Khepri, this means they could
cease operating locally until the member is restarted with the new
machine version.

We want to delay the machine upgrade until all members know about the
new version. This ensures all members can continue to provide their
service.

[How]
The machine version to use is communicated by the leader using the
`noop` command. This command is the first one sent just after an
election. Before this patch, the machine version passed was the
leader's local machine version.

With this patch, the `noop` command sent after an election passes the
effective machine version, except if the leader is alone (unclustered),
in which case it passes the latest machine version. Therefore, in a
cluster, the leader sends a second `noop` command with a newer machine
version later, once all members support it.

To determine what each follower supports, this patch introduces two
commands:
* `#info_rpc{}`
* `#info_reply{}`

Once a leader is elected, in addition to the `noop` command, it sends
an `#info_rpc{}` command to all followers. They reply with an
`#info_reply{}` carrying the machine version they support. This
mechanism is not specific to machine upgrades: it could be extended in
the future to communicate more details about each follower.

Once the leader has received the machine version of every follower, it
can determine the highest machine version supported by all members.
For that, it simply takes the lowest reported machine version
(including the leader's own machine version).
If this version is greater than the effective machine version, the
leader sends a new `noop` command with the new machine version to use.

The leader re-sends the `#info_rpc{}` command at each "tick" to
followers that have not reported anything yet, or whose reported
machine version is lower than the leader's own supported machine
version. This takes care of followers that did not receive the initial
`#info_rpc{}` and those that were restarted as part of an upgrade.

Fixes #490.

V2: Address comments from @kjnilsson:
* Use an empty map by default in `#info_reply{}` instead of
  `undefined`. This simplifies the handling of the reply with a single
  `lists:foldl/3` instead of two.
* Merge `has_enough_peer_info/1` into
  `get_max_supported_machine_version/1`.
* Add a system-level option to restore the Ra 2.15 behavior.
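
The decision logic described in [How] can be modelled with a short
sketch. This is not Ra's Erlang code: it is a Python illustration, and
the function names (`max_supported_machine_version`,
`next_noop_version`, `followers_to_query`), the follower names, and the
use of `None` for "no reply yet" are all assumptions made for the
example.

```python
from typing import Dict, List, Optional


def max_supported_machine_version(
    leader_version: int, reported: Dict[str, Optional[int]]
) -> Optional[int]:
    """Highest version every member supports, or None if a reply is missing.

    `reported` maps a (hypothetical) follower name to the version it sent
    back, or None when it has not replied yet.
    """
    if any(version is None for version in reported.values()):
        return None
    # The lowest version among all members, leader included, is the
    # highest one that the whole cluster can run.
    return min([leader_version, *reported.values()])


def next_noop_version(
    effective_version: int,
    leader_version: int,
    reported: Dict[str, Optional[int]],
) -> Optional[int]:
    """Version to carry in a new noop command, or None if no upgrade is due."""
    candidate = max_supported_machine_version(leader_version, reported)
    if candidate is not None and candidate > effective_version:
        return candidate
    return None


def followers_to_query(
    leader_version: int, reported: Dict[str, Optional[int]]
) -> List[str]:
    """Followers to ask again at the next tick.

    These are the followers that have not replied yet, or that reported a
    version lower than the leader's own supported version.
    """
    return sorted(
        name
        for name, version in reported.items()
        if version is None or version < leader_version
    )
```

With a leader at version 2, an effective version of 1, and followers
reporting `{"b": 2, "c": 1}`, the sketch proposes no upgrade and keeps
querying `c`. Once `c` reports 2, `next_noop_version` returns 2.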