operator: upgrade all control plane nodes first #3444
base: burgerdev/k8s-1.31
Conversation
Force-pushed 39f28da to 5000cc9
Force-pushed 5000cc9 to 14e650b
Only use the feature flag on K8s versions that support it. Otherwise, Constellation throws an error during init. Also fix up some tests.
This bumps the controller-gen version and also adjusts the generate commands (back to the original ones). This allows correct generation of CRDs and Go code.
Force-pushed 14e650b to 5fcf88f
Force-pushed a612103 to d0e8a38
maxNodeBudget:
  description: MaxNodeBudget is the maximum number of nodes that can
    be created simultaneously.
  format: int32
  type: integer
Is this problematic for upgrades, since we never upgrade the installed CRDs through helm?
No, I think Helm has some upgrade magic built in regarding CRDs. But since this time we only add one optional field, we are fine. Proof: https://github.com/edgelesssys/constellation/actions/runs/11455280858/job/31871905456.
Or do you have a concrete error in mind that I'm missing?
As far as I know, Helm simply doesn't do anything if a CRD already exists: https://helm.sh/docs/chart_best_practices/custom_resource_definitions/#method-1-let-helm-do-it-for-you
So if we change the CRDs, running helm apply won't actually apply these changes.
From what I can tell, this effectively makes the maxNodeBudget option in the NodeVersion CRD non-functional.
I'm assuming everything still works fine because we default to 1 if the value is not set in the CR.
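A minimal sketch of that defaulting behavior, for illustration only (the helper and the stand-in type are assumptions; only the maxNodeBudget field name comes from the diff above):

```go
package main

import "fmt"

// nodeVersionSpec is a stand-in for the spec fields relevant here.
type nodeVersionSpec struct {
	MaxNodeBudget uint32 `json:"maxNodeBudget,omitempty"`
}

// nodeBudget falls back to the previous behavior of replacing one node at a
// time when maxNodeBudget is unset (zero), e.g. because the installed CRD was
// never updated and the field is missing from the CR.
func nodeBudget(spec nodeVersionSpec) uint32 {
	if spec.MaxNodeBudget == 0 {
		return 1
	}
	return spec.MaxNodeBudget
}

func main() {
	fmt.Println(nodeBudget(nodeVersionSpec{}))                 // 1
	fmt.Println(nodeBudget(nodeVersionSpec{MaxNodeBudget: 5})) // 5
}
```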
Oh, thanks for the hint. Yes, the behavior is (hopefully) exactly the same. I think we are good for now, then (also since the e2e test passed). I didn't want to advertise this feature anyway, but even if we do, we should add this constraint.
In the future(tm), we likely want to do what others do (e.g., istio: https://istio.io/latest/docs/setup/upgrade/helm/#canary-upgrade-recommended).
operators/constellation-node-operator/controllers/nodeversion_controller.go (resolved)
operators/constellation-node-operator/controllers/scalinggroup_controller.go (resolved)
Force-pushed d0e8a38 to a8ec800
Before we call out to the cloud provider, we check if there are still control plane nodes that are outdated (or donors). If there are, we don't create any worker nodes, even if we have the budget to do so.
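Roughly, the gate described in that commit message could look like this (a hedged sketch; the types and helper names are illustrative, not the operator's actual API):

```go
package main

import "fmt"

// nodeState is a stand-in for the per-node information the operator tracks.
type nodeState struct {
	controlPlane bool
	outdated     bool
	donor        bool
}

// mayCreateWorkers reports whether new worker nodes may be requested from the
// cloud provider: only once no control plane node is outdated or acting as a donor.
func mayCreateWorkers(nodes []nodeState) bool {
	for _, n := range nodes {
		if n.controlPlane && (n.outdated || n.donor) {
			return false
		}
	}
	return true
}

func main() {
	nodes := []nodeState{
		{controlPlane: true, outdated: true},
		{controlPlane: false, outdated: true},
	}
	fmt.Println(mayCreateWorkers(nodes)) // false: a control plane is still outdated
}
```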
Force-pushed a8ec800 to 72f85f3
Coverage report
@@ -19,4 +19,6 @@ const (
	PlaceholderControlPlaneScalingGroupName = "control-planes-id"
	// PlaceholderWorkerScalingGroupName name of the worker scaling group used if upgrades are not yet supported.
	PlaceholderWorkerScalingGroupName = "workers-id"
	// ControlPlaneRoleLabel label used to identify control plane nodes.
Suggested change:
// ControlPlaneRoleLabel label used to identify control plane nodes.
// https://kubernetes.io/docs/reference/labels-annotations-taints/#node-role-kubernetes-io-control-plane
Nit, just to document that this is canonical.
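For reference, a small sketch of how the canonical label can be used to tell control-plane nodes from workers (a hypothetical helper built on the k8s.io/api types, not code from this PR):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// Canonical role label, as documented at
// https://kubernetes.io/docs/reference/labels-annotations-taints/#node-role-kubernetes-io-control-plane
const controlPlaneRoleLabel = "node-role.kubernetes.io/control-plane"

// isControlPlaneNode checks for the presence of the label; its value is
// typically empty, so only the key matters.
func isControlPlaneNode(n corev1.Node) bool {
	_, ok := n.Labels[controlPlaneRoleLabel]
	return ok
}

func main() {
	n := corev1.Node{}
	n.Labels = map[string]string{controlPlaneRoleLabel: ""}
	fmt.Println(isControlPlaneNode(n)) // true
}
```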
r.RLock()
defer r.RUnlock()
Does this need a write lock now?
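For context on the question, this is the general sync.RWMutex rule at play (a generic example, not the operator's reconciler code): a read lock only covers concurrent reads; once the critical section mutates shared state, it needs the exclusive write lock.

```go
package main

import (
	"fmt"
	"sync"
)

type registry struct {
	mu    sync.RWMutex
	items map[string]string
}

// get only reads, so the shared read lock (RLock) is sufficient.
func (r *registry) get(key string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.items[key]
}

// set mutates the map, so it must take the exclusive write lock (Lock);
// doing this under RLock would be a data race.
func (r *registry) set(key, value string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.items[key] = value
}

func main() {
	r := &registry{items: map[string]string{}}
	r.set("node", "ready")
	fmt.Println(r.get("node"))
}
```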
@@ -21,6 +21,8 @@ type NodeVersionSpec struct {
	KubernetesComponentsReference string `json:"kubernetesComponentsReference,omitempty"`
	// KubernetesClusterVersion is the advertised Kubernetes version of the cluster.
	KubernetesClusterVersion string `json:"kubernetesClusterVersion,omitempty"`
	// MaxNodeBudget is the maximum number of nodes that can be created simultaneously.
	MaxNodeBudget uint32 `json:"maxNodeBudget,omitempty"`
I'm against making this a user-facing feature for now, because we did not discuss its semantics sufficiently. What if the user sets it to 1000 - do we replace all control-plane nodes at once? Should we have different budgets for control planes and for workers? Could this be relative? Should we rather implement a different upgrade algorithm (3533 etc)? I'm also reminded of the evolution of https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#rollingupdatedeployment-v1-apps.
Afaict this PR would be much smaller if we removed this feature and tried to find a different way to test it.
> do we replace all control-plane nodes at once?

Yes, I also tested this by setting the budget to 5 when I had 3:2 nodes. But "at once" only applies to the world as the operator sees it. The join service still forbids multiple control-plane nodes joining at the same time. Note that this is the scenario right after initializing a Constellation with >=3 control planes. Also, "replace" means adding the new nodes first, so in theory you could go from 3 control planes to 6, since the operator only removes a node once the handover is finished.

> Should we have different budgets for control planes and for workers?

We might do that in the future, if a customer requires it or we think we need it.

> Could this be relative?

I assume you mean a percentage value. Sure, but this is more difficult/complex than setting the number of nodes.

> Should we rather implement a different upgrade algorithm

I think the bug is orthogonal to this proposal, since I'm not reworking the operator's replacement algorithm, which would be a large undertaking in my opinion.

> I'm also reminded of the evolution of https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#rollingupdatedeployment-v1-apps

I don't know the evolution; is there a summary somewhere of what happened?

Note also that the operator API is on version v1alpha1, so in my opinion we don't have to provide any API stability guarantees between Constellation versions and can completely change the whole upgrade process and APIs between versions.

Some of the current fields are not really user-facing in the sense that the user should use them directly. Changing the image reference requires (1) the image reference to be one of our image references and (2) the measurements in the join config to match the image. Changing the K8s version requires a config map under that name, and for the Constellation to upgrade correctly it has to contain the right set of components and patches.

> Afaict this PR would be much smaller if we removed this feature and tried to find a different way to test it.

Then I'll have another try at the test for this, but it might take a bit of time before I get back to this.
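To make the "add first, remove after handover" point concrete, here is a tiny illustrative calculation (the numbers mirror the example above; the function is an assumption, not operator code):

```go
package main

import "fmt"

// replacementsToCreate models how many new control plane nodes may be
// requested in one pass: bounded by the budget, while old nodes are only
// removed after their handover finishes, so the node count can temporarily grow.
func replacementsToCreate(outdated, budget int) int {
	if outdated < budget {
		return outdated
	}
	return budget
}

func main() {
	// 3 outdated control planes and a budget of 5: up to 3 replacements are
	// created before any old node is removed, i.e. temporarily 6 control planes.
	fmt.Println(replacementsToCreate(3, 5)) // 3
}
```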
Context
Allows #3396. Since kubelets must not communicate with a Kubernetes API server that has an older version than the kubelet itself, we need to upgrade all control plane nodes before upgrading the worker nodes. Control plane nodes are configured such that they only talk to their local API server, which matches their kubelet version.
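As a rough illustration of the skew rule behind this ordering (a sketch under assumptions, not the operator's implementation; it uses golang.org/x/mod/semver for the comparison):

```go
package main

import (
	"fmt"

	"golang.org/x/mod/semver"
)

// kubeletMayJoin sketches the Kubernetes skew rule discussed above:
// a kubelet must not be newer than the API server it talks to.
func kubeletMayJoin(kubeletVersion, apiServerVersion string) bool {
	return semver.Compare(kubeletVersion, apiServerVersion) <= 0
}

func main() {
	// Worker upgraded to v1.31 while a control plane still runs v1.30: not allowed.
	fmt.Println(kubeletMayJoin("v1.31.1", "v1.30.4")) // false
	// After all control planes run v1.31, workers may follow.
	fmt.Println(kubeletMayJoin("v1.31.1", "v1.31.1")) // true
}
```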
Proposed change(s)
How to test:
Related issue
kubernetes/kubernetes#127316
Checklist
gcp-snp, 3:2, v1.30.4 -> v1.31.1 K8s, v1.18.0 -> head of this PR: https://github.com/edgelesssys/constellation/actions/runs/11431369883