How does omni decide which nodes to reboot at the same time? #932

justyns · 2025-02-14T20:47:10Z


I have a few different node groups set up, and I've noticed that when I apply patches or other configuration changes that require rebooting a Talos node, Omni reboots nodes in a way that occasionally causes downtime.

For example:

Out of these two groups, the first group is rebooting both "worker" nodes at the same time. Sometimes it reboots the two nginx nodes at the same time. Is there documentation or a configuration option that defines the order in which the servers are rebooted?

If I create a separate group for the workers, would it still potentially reboot both of them at the same time?

Answered by utkuozdemir


utkuozdemir · 2025-02-20T08:49:54Z

Maintainer

Whether a node reboots depends on what the patch changes in the machine configuration: some changes require a reboot and some do not. If a change requires a reboot, the reboot is done automatically.

The update strategy can be configured, similarly to update strategies in Kubernetes.
If you are using cluster templates, it is documented here: https://omni.siderolabs.com/reference/cluster-templates#updatestrategy

On the web UI, under the cluster scaling screen, you can click this button on the right side of the worker set to configure it:

The strategy applies per machine pool (control plane or worker), so patch rollouts for separate groups run in parallel, each respecting its own strategy.

Note that this only applies to config patches; upgrades work differently.
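
For the cluster template route, a rough sketch of what two worker sets with their own update strategy might look like is below. This is only a fragment, not a complete template, and it rests on assumptions: the `updateStrategy`, `type`, `rolling`, and `maxParallelism` field names are taken from memory of the linked cluster templates reference and should be checked against it, and the set names and machine IDs are hypothetical placeholders.

```yaml
# Fragment of an Omni cluster template showing only the worker sets.
# Set names and machine IDs are hypothetical; field names are assumed
# to match the cluster templates reference linked above.
---
kind: Workers
name: workers
machines:
  - 00000000-0000-0000-0000-000000000001
  - 00000000-0000-0000-0000-000000000002
updateStrategy:
  type: Rolling
  rolling:
    maxParallelism: 1   # intended to limit this set to one machine at a time
---
kind: Workers
name: nginx
machines:
  - 00000000-0000-0000-0000-000000000003
  - 00000000-0000-0000-0000-000000000004
updateStrategy:
  type: Rolling
  rolling:
    maxParallelism: 1   # each set follows its own strategy independently
```

With a setup along these lines, a config patch that needs a reboot would be expected to roll through each set one machine at a time, though one machine from `workers` and one from `nginx` could still be rebooting at the same moment because the sets roll out in parallel.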

1 reply

justyns Feb 21, 2025
Author

Thanks @utkuozdemir, I missed the update strategy in the docs when I looked before. It looks like what I want; I'll test it out when I get a chance.
