How does omni decide which nodes to reboot at the same time? #932
I have a few different node groups set up, and I've noticed that when applying patches or other configuration changes that require a reboot of the Talos node, Omni will reboot nodes in a way that occasionally causes downtime. For example, out of these two groups, the first group is rebooting both "worker" nodes at the same time. Sometimes it reboots the two nginx nodes at the same time. Is there documentation or a configuration option that defines the order the servers are rebooted in? If I create a separate group for the workers, would it still potentially reboot both of them at the same time?
Replies: 1 comment 1 reply
Whether a node reboots depends on what the patch changes in the machine configuration: some changes require a reboot and some do not, and when a reboot is required it is done automatically. The update strategy can be configured similarly to Kubernetes. In the web UI, on the cluster scaling screen, you can configure it with the button on the right side of the worker set. The strategy applies per machine pool (control plane or worker), so patch rollouts for separate groups run in parallel, each respecting its own strategy. Note that this only applies to config patches; upgrades work differently.
If you are using cluster templates, it is documented here: https://omni.siderolabs.com/reference/cluster-templates#updatestrategy
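As a rough illustration of the per-pool behaviour (a toy model only, not Omni's actual scheduler), the sketch below splits each machine set into reboot batches capped by a parallelism limit. The function names, the `max_parallelism` parameter, and the machine names are hypothetical and chosen for the example.

```python
from itertools import islice


def rolling_batches(machines, max_parallelism=1):
    """Split one machine set into batches of at most max_parallelism machines.

    Toy model: each batch is the group of machines that may reboot together.
    """
    if max_parallelism < 1:
        raise ValueError("max_parallelism must be >= 1")
    remaining = iter(machines)
    batches = []
    while True:
        batch = list(islice(remaining, max_parallelism))
        if not batch:
            return batches
        batches.append(batch)


def plan_rollout(machine_sets):
    """Plan batches for every machine set independently.

    machine_sets maps a set name to (machines, max_parallelism).
    Sets do not coordinate with each other, so batches from different
    sets may be in flight at the same time.
    """
    return {
        name: rolling_batches(machines, limit)
        for name, (machines, limit) in machine_sets.items()
    }


if __name__ == "__main__":
    # Hypothetical machine names, for illustration only.
    plan = plan_rollout({
        "workers": (["worker-1", "worker-2"], 1),
        "nginx": (["nginx-1", "nginx-2"], 2),
    })
    for name, batches in plan.items():
        print(name, batches)
```

In this model, a limit of 1 never puts both workers in the same batch, while a limit of 2 allows both nginx nodes to go down together. Because each set is planned on its own, a worker and an nginx node can still be rebooting at the same moment.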