This repository is used to install a Talos Kubernetes cluster using on-prem Omni in a declarative manner. Most of these steps should work without modification. Obviously paths and domain names should change as required.
I use six NUCs/mini-PCs for my cluster. Three are old Intel NUCs that serve as control plane nodes. The other three are Beelink mini-PCs. I wanted to set up an easily reproducible Talos Kubernetes cluster and keep my creative node naming strategy of NUC1 through NUC6.
Once complete, you will have a Kubernetes cluster running the latest Kubernetes flavour, but without a CNI. This means your cluster won't actually be running until a CNI is installed. I used Cilium for my cluster following these steps.
I installed Omni on a Raspberry Pi I'm using for other Docker-related stuff.
- Follow the Omni on-prem install instructions.
- Configure the docker-compose.yaml file:
  - The most important app is obviously omni.
  - traefik is used for HTTPS traffic management and generates certificates via LetsEncrypt.
  - The dumpcerts app creates .pem files from the LetsEncrypt certificates that can be consumed by Omni. See the next section for more information.
- Make sure an A record for omni.mydomain.com is added to whatever is being used to serve DNS.
Omni requires a public certificate for nodes to connect to. It is very important to keep this certificate up to date. When nodes reboot while the Omni certificate is expired, Kubernetes pods misbehave in ways that are hard to diagnose.
The Traefik section of my docker-compose file uses certificates issued by LetsEncrypt, but these certificates aren't available in a file format that can be consumed by the Omni container. To get around this, I use a container called dumpcerts to regularly create/update .pem files from the LetsEncrypt certificate. Omni then consumes these certificates. The dumpcerts container is a home-built container defined in my Docker repository.
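Since an expired certificate is so disruptive, it can be worth checking how long the .pem file is still valid. This is a minimal sketch, assuming openssl is installed on the Docker host. The helper name cert_valid_for_days and the example path are hypothetical and not part of my setup.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the certificate in $1 is still valid
# for at least $2 more days, fails otherwise.
cert_valid_for_days() {
  cert_file=$1
  days=$2
  # -checkend takes seconds and exits non-zero if the certificate
  # expires within that window.
  openssl x509 -checkend "$((days * 86400))" -noout -in "$cert_file" >/dev/null
}

# Example (the path is an assumption; use wherever dumpcerts writes its output):
# cert_valid_for_days /docker/omni/certs/omni.mydomain.com.pem 14 \
#   || echo "Omni certificate expires within 14 days"
```

Run from cron, a check like this gives some warning before nodes start failing to reconnect.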
- Download omnictl and talosctl from https://omni.mydomain.com and put them in the proper locations on your workstation. For AMD64 (x86_64) machines:
sudo mv omnictl-linux-amd64 /usr/local/bin/omnictl
sudo mv talosctl-linux-amd64 /usr/local/bin/talosctl
sudo chmod u+x /usr/local/bin/omnictl /usr/local/bin/talosctl
For ARM64 machines:
sudo mv omnictl-linux-arm64 /usr/local/bin/omnictl
sudo mv talosctl-linux-arm64 /usr/local/bin/talosctl
sudo chmod u+x /usr/local/bin/omnictl /usr/local/bin/talosctl
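- Download omniconfig.yaml and talosconfig.yaml from omni.mydomain.com and put them in the proper locations on your workstation. Create the target directories first if they don't exist yet.
mkdir -p ~/.config/omni ~/.talos
mv omniconfig.yaml ~/.config/omni/config
mv talosconfig.yaml ~/.talos/config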
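The two architecture-specific variants of the binary install step can be folded into one helper that picks the file suffix from `uname -m`. This is only a sketch: the function names are hypothetical, and it assumes the downloaded files sit in the current directory under the names used above.

```shell
#!/bin/sh
# Map a `uname -m` value to the suffix used in the download file names.
arch_suffix() {
  case $1 in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Move both downloaded binaries into /usr/local/bin and make them executable.
install_omni_tools() {
  suffix=$(arch_suffix "$(uname -m)") || return 1
  for tool in omnictl talosctl; do
    sudo mv "${tool}-linux-${suffix}" "/usr/local/bin/${tool}"
    sudo chmod u+x "/usr/local/bin/${tool}"
  done
}
```

Calling `install_omni_tools` from the download folder then does the same thing as the manual commands for whichever architecture the workstation has.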
These steps assume kubectl is already installed.
- Install Krew
(
set -x; cd "$(mktemp -d)" &&
OS="$(uname | tr '[:upper:]' '[:lower:]')" &&
ARCH="$(uname -m | sed -e 's/x86_64/amd64/' -e 's/\(arm\)\(64\)\?.*/\1\2/' -e 's/aarch64$/arm64/')" &&
KREW="krew-${OS}_${ARCH}" &&
curl -fsSLO "https://github.com/kubernetes-sigs/krew/releases/latest/download/${KREW}.tar.gz" &&
tar zxvf "${KREW}.tar.gz" &&
./"${KREW}" install krew
)
- Add Krew path to ~/.bashrc
echo 'export PATH="${KREW_ROOT:-$HOME/.krew}/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
- Install OIDC-Login in Kubectl
kubectl krew install oidc-login
If you're using WSL, install wslu. This allows for external browser redirection from the WSL session to your main browser in Windows.
sudo apt install wslu -y
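If the same setup script runs on both WSL and native Linux machines, the wslu install can be made conditional. A minimal sketch, assuming the WSL kernel identifies itself with the word "microsoft" in /proc/version; the function names are hypothetical.

```shell
#!/bin/sh
# Succeeds if the given kernel version file mentions Microsoft,
# which is how WSL kernels identify themselves.
is_wsl() {
  grep -qi microsoft "${1:-/proc/version}" 2>/dev/null
}

# Install wslu only when running under WSL.
install_wslu_if_needed() {
  if is_wsl; then
    sudo apt install wslu -y
  else
    echo "Not running under WSL; skipping wslu"
  fi
}
```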
Make sure all nodes are up and running in maintenance mode and are visible in https://omni.mydomain.com
You will need to modify the machine GUIDs in cluster-template-home.yaml to suit your needs. I have multiple cluster templates for home, lab and laptop to test various things. You may not need all this.
I set up a pass-through container cache in Docker on my NAS, which is defined in machine-registries.yaml. You probably won't be using this.
If any of your machine GUIDs are not randomly assigned and the BIOS is American Megatrends (AMI)-based, you may be able to create a bootable USB from the files in uuid-gen to set a random machine GUID.
I used PXEBoot and Matchbox for this. I will publish how I did this some other time.
Once you're ready to create your cluster, run the command below from your workstation. Yep, that's it.
omnictl cluster template sync -f ~/omni/cluster-template-home.yaml
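If you would rather check a template before applying it, the sync can be wrapped in a small function. This sketch assumes your omnictl version provides the `validate` and `diff` subcommands of `cluster template`, which is my understanding of the CLI but worth confirming with `omnictl cluster template --help`. The function name and the OMNICTL override are hypothetical.

```shell
#!/bin/sh
# OMNICTL can be overridden (for example with `echo`) to preview the commands.
OMNICTL=${OMNICTL:-omnictl}

# Validate the template, show pending changes, then apply them.
# Stops at the first step that fails.
sync_cluster_template() {
  template=$1
  "$OMNICTL" cluster template validate -f "$template" &&
  "$OMNICTL" cluster template diff -f "$template" &&
  "$OMNICTL" cluster template sync -f "$template"
}
```

Usage would be `sync_cluster_template ~/omni/cluster-template-home.yaml`, or the same call with `OMNICTL=echo` set to see the three commands without running them.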
Then install Cilium using whatever method you desire. In my case, I used an Ansible script to install the core apps that would allow me to log into ArgoCD and install everything else:
- Cilium
- External Secrets
- Cert Manager
The repo that contains all that is currently private. I may expose it once I'm confident all secrets are gone.
If you're using a remote SSH shell to connect to the cluster, add the following to your ~/.ssh/config:
Host myhost
LocalForward 8000 127.0.0.1:8000
LocalForward 18000 127.0.0.1:18000
Add - --skip-open-browser to the Omni user account in the users: section of your ~/.kube/config for Omni, as in the example below:
users:
- name: [email protected]
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - oidc-login
      - get-token
      - --oidc-issuer-url=https://omni.mydomain.com/oidc
      - --oidc-client-id=native
      - --oidc-extra-scope=cluster:home
      - --skip-open-browser
      command: kubectl
      env: null
      interactiveMode: IfAvailable
      provideClusterInfo: false
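A freshly downloaded kubeconfig will not contain the extra argument, so it is easy to lose after re-downloading. A small check like the following can confirm it is still there. This is a sketch with a hypothetical function name, and it only looks for the flag anywhere in the file rather than parsing the YAML.

```shell
#!/bin/sh
# Succeeds if the kubeconfig at $1 (default ~/.kube/config) contains the flag.
has_skip_open_browser() {
  # -e keeps grep from treating the pattern itself as an option.
  grep -q -e '--skip-open-browser' "${1:-$HOME/.kube/config}"
}

# Example:
# has_skip_open_browser || echo "--skip-open-browser is missing from ~/.kube/config"
```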
It is important to back up the Omni etcd database as well as the omni.asc key in case of disaster. Here is a simple script to do this. It requires the etcdctl client to be installed:
sudo apt install etcd-client
This script takes a snapshot of the etcd database and archives the entire contents of the Omni folder. It keeps daily, weekly and monthly backups. This example writes to a mounted NAS folder. Add it to crontab to run it daily.
#!/bin/sh
day=$(date +%A)
dayofmonth=$(date +%-d)
echo "$(date +%F_%T) Backing up Omni etcd database..."
ETCDCTL_API=3 etcdctl snapshot save /docker/omni/snapshot.db
sudo zip -r "/mnt/omni-backup/etcdbackup-$day.zip" /docker/omni/
# Keep a monthly copy on the 1st and weekly copies on the 7th, 14th, 21st and 28th.
if [ "$dayofmonth" -eq 1 ]; then
  echo "Creating monthly backup..."
  cp "/mnt/omni-backup/etcdbackup-$day.zip" "/mnt/omni-backup/etcdbackup-monthly-$(date +%m).zip"
fi
case $dayofmonth in
  7|14|21|28)
    echo "Creating weekly backup..."
    cp "/mnt/omni-backup/etcdbackup-$day.zip" "/mnt/omni-backup/etcdbackup-weekly-$dayofmonth.zip"
    ;;
esac
echo "$(date +%F_%T) Omni etcd database has been backed up."
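The rotation rule in the script is easier to reason about when pulled out on its own. The helper below is a hypothetical sketch that only reports which copies a given day of the month produces; it does not touch any files.

```shell
#!/bin/sh
# Print which backup copies the rotation produces for a given day of the month.
backup_tiers() {
  dayofmonth=$1
  tiers=daily
  # A monthly copy is made on the 1st.
  [ "$dayofmonth" -eq 1 ] && tiers="$tiers monthly"
  # Weekly copies are made on the 7th, 14th, 21st and 28th.
  case $dayofmonth in
    7|14|21|28) tiers="$tiers weekly" ;;
  esac
  echo "$tiers"
}
```

For example, `backup_tiers 14` prints `daily weekly`. Because the daily file is named after the weekday, the weekly file after the day of the month, and the monthly file after the month number, the script ends up keeping a rolling set of 7 daily, 4 weekly and 12 monthly archives.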
- Copy omni.asc to the omni folder on your Docker host (or wherever the Omni Docker folder resides).
- Copy snapshot.db to the omni folder on your Docker host.
- Run the following commands to restore the Omni database:
cd /docker/omni
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db
mv default.etcd etcd
- Start the Omni container.
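Before running the restore steps above, a quick pre-flight check can catch the most likely mistakes. This is a sketch under the assumption that the Omni folder is /docker/omni as in the commands above; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the restore steps: $1 is the Omni folder.
restore_preflight() {
  omni_dir=${1:-/docker/omni}
  # Both the key and the snapshot must be in place before restoring.
  for f in omni.asc snapshot.db; do
    if [ ! -f "$omni_dir/$f" ]; then
      echo "missing $omni_dir/$f" >&2
      return 1
    fi
  done
  # `mv default.etcd etcd` would nest the restored data inside an
  # existing etcd directory instead of replacing it.
  if [ -e "$omni_dir/etcd" ]; then
    echo "$omni_dir/etcd already exists; move it aside first" >&2
    return 1
  fi
}
```

Running `restore_preflight && echo "ready to restore"` before the etcdctl command gives a simple go/no-go signal.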