fix(cilium): add taint and install cilium first - WIP #275
Conversation
Force-pushed from e967687 to c053b3e
Very good start @t1murl, thank you :) I'd just love for you to keep the HelmChart and values logic, as it's ideal IMHO. If not, why?
@mysticaltech, thanks! 👍
Also just noticed the k3s docs stating:
It seems like the taint is not re-applied after a node reboot, potentially leaving the initial problem description unresolved by what I've done so far. I'll look into what we may be able to do with e.g. kured (i.e. have it taint the node before it issues the reboot) to extend the current draft.
Got it! We can execute arbitrary commands with Kured. So maybe a custom bash script would help resolve the situation?!
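Roughly, such a script could look like the following; this is only a minimal sketch, assuming Kured's `--reboot-command` flag is pointed at a wrapper on the host, that a kubeconfig is reachable locally, and that the taint in question is Cilium's `node.cilium.io/agent-not-ready` (later comments note the catch that Cilium removes the taint again while its agent is still running, so the timing matters).

```bash
#!/usr/bin/env bash
# Hypothetical wrapper, e.g. /usr/local/bin/taint-and-reboot.sh, intended to be
# used as Kured's --reboot-command so the taint is applied right before rebooting.
set -euo pipefail

NODE_NAME="$(hostname)"                      # assumes the node name equals the hostname
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml  # assumed kubeconfig location

# Re-apply the taint that keeps pods from starting before the Cilium agent is ready.
kubectl taint node "${NODE_NAME}" \
  node.cilium.io/agent-not-ready=true:NoExecute --overwrite

# Hand over to the normal reboot.
systemctl reboot
```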
Or a startup script for Cilium nodes; MicroOS offers that kind of functionality easily.
@mysticaltech, that also sounds promising! I am not really familiar with MicroOS... I reckon you mean this? Any help on this is more than welcome :)
@t1murl Combustion or Ignition are for first boot. I'm thinking more in terms of https://www.suse.com/c/easy-running-scripts-boot-and-shutdown/
FYI: Tainting the node while Cilium is still running will result in Cilium instantly deleting the taint again. I feel like we might want to taint the node when it is in a state where Cilium is not running (anymore) and before pods start coming up (again).
@mysticaltech From quickly reading into this, this indeed sounds great to me for correctly timing the tainting! Very good idea in my opinion 🙂👍 I feel like we should try this. Only need to find a neat way to run e.g.
Perfect!! Yep, sounds good 😊 About the kubeconfig: it is already being downloaded from the first control plane node (if I remember correctly, in init.tf), so we could re-upload it to the worker nodes, for instance!
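To make that boot-script idea concrete, here is one possible shape: a systemd oneshot unit that re-applies the taint before k3s starts, using a kubeconfig re-uploaded as described above. The unit name, ordering, kubeconfig path, and kubectl location are all assumptions, not something settled in this thread.

```bash
# Sketch: drop a oneshot unit on each node that runs before k3s and re-applies
# the Cilium taint; paths and the unit name are hypothetical.
cat > /etc/systemd/system/cilium-taint.service <<'EOF'
[Unit]
Description=Re-apply Cilium agent-not-ready taint before k3s starts
Before=k3s.service k3s-agent.service
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
Environment=KUBECONFIG=/etc/rancher/k3s/uploaded-kubeconfig.yaml
ExecStart=/usr/bin/kubectl taint node %H node.cilium.io/agent-not-ready=true:NoExecute --overwrite

[Install]
WantedBy=multi-user.target
EOF
systemctl enable cilium-taint.service
```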
@t1murl I just came back from a break; I'll try to help with this ASAP. |
@PurpleBooth If you have some bandwidth, you know how much your contributions are appreciated! :) 🙏 |
Let me reflect back my understanding of the problem here. What we're trying to avoid is containers that are not managed by Cilium, i.e. ones that were launched before Cilium was fully initialised. From what I understand, Cilium offers two techniques for this: firstly you can use this taint, or secondly you can restart the pods. We probably don't want to be randomly restarting pods, so the taint is the nicer option. However, adding this taint means that the HelmController won't schedule the pod, because the tolerations for

The approach here is to use the helm CLI rather than the controller. Other approaches might include adding a pod to add the toleration to the controller, using the cilium cli (which is basically the same as the helm chart), or we could render using

Dropped a question over at the HelmChart repository to see if they have any thoughts: k3s-io/helm-controller#151. I find it hard to believe we are the only ones to have experienced this.
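For reference, going through the helm CLI instead of the controller could look roughly like this; a sketch only, with a placeholder chart version and a hypothetical values file rather than anything agreed in this thread.

```bash
# Install Cilium directly with the helm CLI instead of the HelmController,
# so the install is not blocked by whatever the controller's job pods tolerate.
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium \
  --namespace kube-system \
  --version 1.11.6 \
  --values cilium-values.yaml   # hypothetical file mirroring the current valuesContent
```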
@PurpleBooth I did not realize that the Helm controller was in play even after Cilium is installed. Good catch! @t1murl Does that make sense to you?
@PurpleBooth, thanks for sharing your reflection on this problem and thanks for opening k3s-io/helm-controller#151! 👍

We still have to implement something that accounts for node reboots after the initial deployment, i.e. the problem description I provided in #270, where I described how pods become unmanaged by Cilium after a node reboot by e.g. kured, because they have already been scheduled and start faster than the Cilium pods when the node comes back up. To account for this, my understanding of a potentially feasible solution would be to use @mysticaltech's proposal mentioned in #275 (comment) to taint the node again (required due to #275 (comment)) when a reboot occurs (also see #275 (comment) regarding the timing & #275 (comment)). Of course this only accounts for

cc: @mysticaltech

PS: Unfortunately even my weekends are busy currently (birthdays & weddings to attend), so more days will pass until I can personally try out @mysticaltech's proposal. Happy to hear you are back @mysticaltech to assist with this! 👍
@t1murl Thanks for clarifying! Helps a lot. As soon as any of us starts working on this, let's signal it here so that we do not overlap on the same things. I'll do my best ASAP, if @PurpleBooth doesn't get this sorted before (she tends to be fast when she starts!). And looking forward to a reply to the issue she opened.
For restarts we could use Kured: https://github.com/weaveworks/kured#adding-node-labels-before-and-after-reboots
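For reference, the feature behind that link boils down to two kured flags; the sketch below uses a placeholder label key, and note the caveat a few comments further down that node labels are not taints.

```bash
# Sketch: add the pre/post reboot label flags to the kured DaemonSet
# (flag names are those from the README section linked above; the label
# key/value "kured.example/rebooting" is just a placeholder).
kubectl -n kube-system patch daemonset kured --type=json -p '[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-",
   "value": "--pre-reboot-node-labels=kured.example/rebooting=true"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-",
   "value": "--post-reboot-node-labels=kured.example/rebooting=false"}
]'
```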
Excellent find, @PurpleBooth; that's another obstacle out of the way!!
I believe it's now worth tweaking this PR and giving it a try as is. Maybe the Cilium containers will start after all, and if they successfully remove the label, that's it, unless I do not see things as clearly as you.

It's also important to remember that not all nodes restart simultaneously, and in an HA setting, at least two CP nodes would already have been bootstrapped and have Cilium running on them. @PurpleBooth What do you think?
@PurpleBooth, @mysticaltech, to avoid confusion: node labels != node taints :O Afaik node labels are not the same thing as node taints.

I've already considered what we could do with kured; however, I've not found a way to have it add a taint, in a way that would help us in the context of this issue, out of the box. I've then thought: why not have it execute a local script to taint the node before it issues the reboot cmd (as also mentioned in #275 (comment))? I've found this not feasible, however, due to #275 (comment). I hope this helps :)
@t1murl My bad, I was confusing the two!
@t1murl @PurpleBooth If the Cilium cli can solve our issues, then let's move away from the helm controller if needed. We can install it on the CP nodes at the moment when the k3s SELinux RPM is installed, so that no additional reboots are needed.
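Installing the cilium CLI on a node could look roughly like this; the pinned version and architecture are placeholders, and per the earlier comment the CLI essentially drives the same Helm chart under the hood.

```bash
# Fetch the cilium CLI from its GitHub releases, install it on the node,
# then let it deploy Cilium into the cluster behind the current kubeconfig.
CILIUM_CLI_VERSION="v0.12.2"   # placeholder version, not pinned by this thread
curl -fL -o /tmp/cilium-cli.tar.gz \
  "https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-amd64.tar.gz"
tar -xzf /tmp/cilium-cli.tar.gz -C /usr/local/bin cilium

cilium install            # installs Cilium via its embedded Helm chart
cilium status --wait      # block until the agent reports ready
```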
And I have verified how to run startup scripts; it's basically via

As said above, the

I also believe it's worth ignoring the tolerations of the Helm controller for now. Let's see if Cilium does not run when its taints are applied. In which case, the wisest solution would be to move away from the Helm Controller towards the Cilium cli to install and run Cilium.
Good news folks, there is a good chance that custom tolerations can be specified via the helm chart, in our case via the HelmChart definition. See this example for Longhorn; other Rancher projects do the same: longhorn/longhorn#2664
Last but not least, there are various tolerations that Cilium supports, which are also supported by the helm controller jobs: https://github.com/k3s-io/helm-controller/search?q=toleration&type=commits

That said, from what I gather, custom tolerations in the HelmChart yaml would go under "spec".
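To make that concrete, here is a sketch of carrying tolerations through the existing HelmChart definition via valuesContent. Whether the HelmController's own install job picks up tolerations from spec, as speculated above, is exactly what is still unclear, and the Cilium value keys shown are assumptions about the chart rather than settled configuration.

```bash
# Sketch only: keep the current HelmChart method and pass tolerations through
# the chart values; the toleration shown mirrors the agent-not-ready taint.
kubectl apply -f - <<'EOF'
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: cilium
  namespace: kube-system
spec:
  repo: https://helm.cilium.io/
  chart: cilium
  targetNamespace: kube-system
  valuesContent: |-
    tolerations:                # assumed chart key for the agent DaemonSet
      - operator: Exists
    operator:
      tolerations:              # assumed chart key for the operator Deployment
        - key: node.cilium.io/agent-not-ready
          operator: Exists
          effect: NoExecute
EOF
```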
@t1murl I will close this as I will redo most of it, to keep the current HelmChart method. But it's good inspiration, thank you!
Changed my mind, merging instead so that I have you in the git history @t1murl; it's very important to me as you've helped so much 🙏. I will compare from master and do the required changes.
This is a WIP PR to fix #270.
I've only had a few minutes to look at this so far, so please consider this as me sharing my idea.
When I've had more time to test and finalize, I'll mark this ready for review.
However, please do not hesitate to edit if someone else finds time earlier than me 👍 .
I've added ToDo comments where I left off.
Edit: Done force pushing, sorry for that.
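For anyone following along, here is a minimal sketch of what the "add taint" side can look like on a k3s node, assuming the node.cilium.io/agent-not-ready taint and k3s's node-taint config option; the exact wiring in this module is what the PR is still working out, and note the caveat discussed above that k3s applies this taint at node registration, not on every reboot.

```bash
# Register the node with the Cilium startup taint so already-scheduled pods do
# not start before the Cilium agent is ready; Cilium removes the taint once ready.
mkdir -p /etc/rancher/k3s
cat >> /etc/rancher/k3s/config.yaml <<'EOF'   # append to (or merge into) the k3s config
node-taint:
  - "node.cilium.io/agent-not-ready=true:NoExecute"
EOF
```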