Adds a new CLUO operator and agent workloads for platform nodes (#226)
* Renames update-operator to update-operator-master, since we want to have CLUO for platform nodes too.
* Renames update-operator.jsonnet to update-operator-master.jsonnet, since we need another operator for the platform too.
* Adds a CLUO agent DaemonSet for platform nodes.
* Renames update-agent.jsonnet to include a 'master' modifier to distinguish it from the agent file for platform nodes.
* Updates system.jsonnet to find the renamed update-operator workloads and the new update-operator workloads for platform nodes.
* Adds a CLUO Deployment for platform nodes.
* Schedules the master cluster CLUO operator on the Prometheus server cloud node. Previously it was on a master node, like a snake eating its tail.
* Since the master cluster CLUO operator is on the Prometheus server, schedules the platform cluster CLUO operator there too, for consistency.
* Adds -before-reboot-annotations to the CLUO operator deployments for both master and platform nodes.
* Updates the -before-reboot-annotations flag values.
* Adds an annotation to the master node so that CLUO for the master cluster can identify master nodes.
* Adds the 'patch' verb to the things that the update operator role can do to nodes.
* Renames the role file for CLUO to be simpler and shorter.
* Adds a new template function CluoAnnotation(), and runs that function from each update-agent DaemonSet.
* Fixes a number of syntax and format errors.
* Fixes typo 'alpine:lastest' to use 'latest'.
* Installs curl and adds line continuation characters at the end of each line of the CluoAnnotation initContainer.
* Moves apk-update and apk-add to the first line.
* Uses the reboot-coordinator service account explicitly, rather than defining it but then using the default namespace SA.
* Attempts to use an initContainer to write a node annotation script to a volume which the update-agent container will mount and run before running the agent.
* Adds a missing comma to update-agent-master.jsonnet.
* Fixes the syntax and spelling of the volumeMounts sections of the initContainers.
* Moves update-operator node annotation to a ConfigMap.
* Removes old reference to CluoAnnotation in templates.jsonnet.
* Adds new configs for the update-operator.
* Adds the update-operator ConfigMap to the reboot-coordinator namespace.
* Changes the format of 'command' for update-agents to something that will hopefully work.
* Runs annotate-node.sh with sh, since it isn't executable.
* Adds single quotes around patch JSON.
* Uses double quotes instead of single quotes for JSON patch to avoid conflict with enclosing single quotes.
* Wraps curl header fields in single quotes.
* Removes single quotes from curl arguments.
* Adds single quotes around JSON patch.
* Removes single quote wrapping from JSON patch.
* Changes update operator annotation from mlab-type-<type> to mlab-type/<type> for better readability.
* Removes the annotation prefix because it didn't really comply with the prefix specification, which is supposed to be a domain name-style identifier.
* Goes back to using all dashes for the update operator node annotations, because I just didn't like the underscore.
* Removes the 'master' update operator and agent, since we will now just use update-operator for rolling reboots of platform nodes only.
* Renames the 'platform' update operator and agent to just update-operator and update-agent, since we will now run a single operator and agent for platform nodes only.
* Removes the -master and -platform suffixes from the update operator and agent DaemonSet and Deployment, and changes the --before-reboot-annotation to 'mlab-reboot-ok'.
* Sets the update-operator to run on a master node again, now that we aren't using it to reboot master nodes.
* Removes the special update-operator annotation from the master nodes, now that master nodes won't be rebooted by update-operator.
* Adds a new reboot-node.service, along with a Timer to execute it once a day, and writes the file to be executed.
* Uses the short weekday name instead of the long one.
* Configures a 'reboot day' for each of the master nodes. Each one will reboot on a different day of the week.
* Updates system.jsonnet to account for the fact that only a single update-operator and agent now exist.
* Does not enable or start the reboot-node.service, as this would cause a reboot loop. The only thing that should run this service is its associated Timer.
* Renames allocate_new_cloud_node.sh to something more intuitive for the platform cluster.
* Calls the newly renamed add_platform_cluster_cloud_node.sh.
* Removes the facility for annotating a node for the update-operator, since that annotation will have to be applied by an operator or some operator script.
* Expands a comment about tolerations.
* Adds an additional safety check in the reboot-node.service script such that a reboot will not occur if the etcd cluster does not have exactly 3 members.
* Restructures the reboot-node script with slightly easier to follow logic.
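
The master-node reboot gate described in the last few items (a per-node 'reboot day' compared against the short weekday name, plus the etcd membership check) can be sketched roughly as follows. This is a minimal illustration, not the script from this commit: the function names `count_members` and `should_reboot`, the `EXPECTED_MEMBERS` variable, and the example wiring with `date +%a`, `etcdctl member list`, and `systemctl reboot` are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of the reboot gate; all names here are illustrative
# and are not taken from the actual reboot-node script.

# Required etcd cluster size, per the commit message.
EXPECTED_MEMBERS=3

# count_members: counts non-empty lines on stdin. Intended for input such
# as the output of `etcdctl member list` (assumed: one line per member).
count_members() {
  grep -c . || true
}

# should_reboot REBOOT_DAY TODAY MEMBER_COUNT
# Succeeds (exit 0) only when TODAY matches this node's reboot day and the
# etcd cluster has exactly EXPECTED_MEMBERS members.
should_reboot() {
  reboot_day="$1"
  today="$2"
  members="$3"
  [ "$today" = "$reboot_day" ] || return 1
  [ "$members" -eq "$EXPECTED_MEMBERS" ] || return 1
  return 0
}

# Example wiring, commented out so the sketch has no side effects:
#   today="$(date +%a)"    # short weekday name, e.g. "Tue"
#   members="$(etcdctl member list | count_members)"
#   should_reboot "Tue" "$today" "$members" && systemctl reboot
```

Keeping the decision in a function that takes the day and member count as arguments makes the gate easy to exercise without touching etcd or rebooting anything. As the commit notes, only the Timer should ever trigger the service that runs such a script.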