This guide shows how to install a 5-node, highly available kubeadm cluster on AWS EC2 instances. If using the KodeKloud AWS Playground environment, please ensure you have selected the us-east-1 (N. Virginia) region from the region selector at the top right of the AWS console. To maintain compatibility with the playground permissions, we will use the following EC2 instance configuration.
- Instance type: `t3.medium`
- Operating System: Ubuntu 22.04 (at time of writing)
- Storage: `gp2`, 8GB
Note that this is a learning exercise in simply getting a cluster running! It will not be suitable for serving workloads to the internet, nor will it be properly secured - otherwise this guide would be three times longer! It should not be used as a basis for building a production cluster.
We will provision the following infrastructure. It will be created by Terraform, so as not to spend too much of the lab time just getting it provisioned, and to allow you to focus on the cluster installation.
As can be seen in this diagram, we will create five EC2 instances to form the cluster - 3 control planes and 2 workers - plus one load balancer to provide access to the API server endpoints, and a further instance, `student-node`, from which to perform the configuration. We build the infrastructure using Terraform from AWS CloudShell (so you don't have to install Terraform on your workstation), then log into `student-node`, which can access the cluster nodes. This relationship between `student-node` and the cluster nodes is similar to CKA Ultimate Mocks and how the real exam works - you start on a separate node (in this case `student-node`), then use SSH to connect to cluster nodes. Note that SSH connections are only possible in the direction of the arrows. It is not possible to SSH from e.g. `controlplane01` directly to `node01`; you must `exit` to `student-node` first. This is also how it is in the exam. `student-node` assumes the role of a bastion host.
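For example, once the infrastructure is up, a session hops around like this (illustrative only; the host names resolve via the `/etc/hosts` entries that Terraform creates):

```bash
# On student-node: hop to a control plane node
ssh controlplane01
# ...do some work there, then come back
exit
# Back on student-node: now hop to a worker
ssh node01
```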
We will also set up direct connectivity from your workstation to the node ports of the workers so that you can browse any NodePort services you create (see security below).
Some basic security will be configured:
- Only the `student-node` will be able to access the cluster's API Server, and this is where you will run `kubectl` commands from when the cluster is running.
- Only the `student-node` can SSH to the cluster nodes.
- Ports required by Kubernetes itself (inc. etcd) and the Calico CNI will be configured in security groups on the cluster nodes.
Security issues that would make this unsuitable for a genuine production cluster:
- The kube nodes should be on private subnets (no direct access from the Internet) and placed behind a NAT gateway to allow them to download packages, or with a more extreme security posture, completely airgapped.
- Access to API server and etcd would be more tightly controlled.
- Use of default VPC is not recommended.
- The node ports will be open to the world - i.e. anyone can connect to them.
- A cloud load balancer coupled with an ingress controller would be provisioned to provide ingress to the cluster. It is definitely not recommended to expose the worker nodes' node ports to the Internet as we are doing here!!!
Other things that will be configured by the Terraform code:

- Host names set on the nodes: `loadbalancer`, `controlplane01`, `controlplane02`, `controlplane03`, `node01`, `node02`.
- Content of `/etc/hosts` set up on all nodes for easy use of the `ssh` command from `student-node` (see the example below).
- Generation and distribution of a key pair for logging into instances via SSH.
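For illustration, once the instances are built, `/etc/hosts` on each node will contain entries of this general form. The private IP addresses shown here are placeholders; view the real file on any node once you are logged in:

```bash
cat /etc/hosts
# 127.0.0.1     localhost
# 172.31.x.x    loadbalancer
# 172.31.x.x    controlplane01
# 172.31.x.x    controlplane02
# 172.31.x.x    controlplane03
# 172.31.x.x    node01
# 172.31.x.x    node02
```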
Let's go ahead and get the infrastructure built!
Click here to start a playground, and click START LAB
to request a new AWS Cloud Playground instance. After a few seconds, you will receive a URL and your credentials to access AWS Cloud console.
Note that you must have KodeKloud Pro subscription to run an AWS playground. If you have your own AWS account, this should still work, however you will bear the cost for any resources created until you delete them.
We will run this entire lab in AWS CloudShell which is a Linux terminal you run inside the AWS console and has most of what we need preconfigured, such as git and the AWS credentials needed by Terraform. Click here to open CloudShell.
From the CloudShell command prompt...
```bash
curl -O https://releases.hashicorp.com/terraform/1.6.2/terraform_1.6.2_linux_amd64.zip
unzip terraform_1.6.2_linux_amd64.zip
mkdir -p ~/bin
mv terraform ~/bin/
terraform version

git clone https://github.com/kodekloudhub/certified-kubernetes-administrator-course.git
```
Now change into the `aws-ha/terraform` directory

```bash
cd certified-kubernetes-administrator-course/kubeadm-clusters/aws-ha/terraform
```
- Run the terraform

```bash
terraform init
terraform plan
terraform apply
```
This should take about half a minute. If this all runs correctly, you will see something like the following at the end of all the output. IP addresses will be different for you.

```
Apply complete! Resources: 43 added, 0 changed, 0 destroyed.

Outputs:

address_node01 = "54.224.201.244"
address_node02 = "44.213.109.108"
address_student_node = "3.92.232.115"
connect_student_node = "ssh ubuntu@3.92.232.115"
```
Copy all these outputs to a notepad for later use.
- Wait for all instances to be ready (Instance state: `running`, Status check: `2/2 checks passed`). This will take 2-3 minutes. See the EC2 console.

- Log into `student-node`
Copy the `ssh` command from the terraform output `connect_student_node` and use it to connect. Note that the IP address will be different for you.
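For example, using the sample Terraform output shown earlier, the command would look like this:

```bash
ssh ubuntu@3.92.232.115
```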
We will install kubectl here so that we can run commands against the cluster when it is built.
- Install the latest version of kubectl and place it in the user programs directory

```bash
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin
```
- Check it

```bash
kubectl version
```

It should, amongst other things, tell you

```
The connection to the server localhost:8080 was refused - did you specify the right host or port?
```

which is fine, since we haven't installed Kubernetes yet.
Now we will configure the load balancer that serves as the endpoint for connecting to the API server. It will round-robin API server requests between the control plane nodes. For this we will use HAProxy in TCP load balancing mode, in which it simply forwards all traffic to its backends (the control planes) without modifying it, e.g. it does not do SSL termination.
First, be logged into `student-node` as directed above.
- Log into the load balancer

```bash
ssh loadbalancer
```
- Become root (saves typing `sudo` before every command)

```bash
sudo -i
```
- Update the apt package index and install packages needed for HAProxy:

```bash
apt-get update
apt-get install -y haproxy
```
- Get the IP addresses of the loadbalancer and the 3 control planes and copy them to your notepad

```bash
dig +short loadbalancer
dig +short controlplane01
dig +short controlplane02
dig +short controlplane03
```
- Create the HAProxy configuration file

First we'll delete the default configuration, then add our own

```bash
rm /etc/haproxy/haproxy.cfg
vi /etc/haproxy/haproxy.cfg
```

Now put the following content into the file. Replace `L.L.L.L` with the IP address of `loadbalancer`, and `X.X.X.X` with the IP of each control plane node.

```
frontend kubernetes
    bind L.L.L.L:6443
    option tcplog
    mode tcp
    default_backend kubernetes-control-nodes

backend kubernetes-control-nodes
    mode tcp
    balance roundrobin
    option tcp-check
    server controlplane01 X.X.X.X:6443 check fall 3 rise 2
    server controlplane02 X.X.X.X:6443 check fall 3 rise 2
    server controlplane03 X.X.X.X:6443 check fall 3 rise 2
```
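Optionally, you can ask HAProxy to validate the configuration file before restarting the service (the `-c` flag performs a configuration check only):

```bash
haproxy -c -f /etc/haproxy/haproxy.cfg
```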
- Restart and check haproxy

```bash
systemctl restart haproxy
systemctl status haproxy
```
It should be warning us that no backend is available - which is true because we haven't installed Kubernetes yet!
- Exit from `sudo` and then back to `student-node`

```bash
exit
exit
```
First, be logged into `student-node` as directed above.

Repeat the following steps on `controlplane01`, `controlplane02`, `controlplane03`, `node01` and `node02` by SSH-ing from `student-node` to each cluster node in turn, e.g.
```
ubuntu@student-node:~$ ssh controlplane01
Welcome to Ubuntu 22.04.2 LTS (GNU/Linux 5.19.0-1028-aws x86_64)
Last login: Tue Jul 25 15:27:07 2023 from 172.31.93.38
ubuntu@controlplane01:~$
```
Note that there's no step to disable swap, since EC2 instances have swap disabled by default.
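If you want to verify this yourself on any of the nodes, the following optional check should show no swap devices and zero swap space:

```bash
swapon --show   # no output means no swap devices are configured
free -h         # the Swap line should show 0B total
```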
- Become root (saves typing `sudo` before every command)

```bash
sudo -i
```
- Update the apt package index and install packages needed to use the Kubernetes apt repository:

```bash
apt-get update
apt-get install -y apt-transport-https ca-certificates curl
```
- Set up the required kernel modules and make them persistent

```bash
cat <<EOF > /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
```
- Set the required kernel parameters and make them persistent

```bash
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
```
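As an optional sanity check, you can confirm that the modules are loaded and the parameters have taken effect:

```bash
lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
# Both sysctl values should report 1
```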
- Install the container runtime

```bash
apt-get install -y containerd
```
- Configure the container runtime to use systemd cgroups. This is the part many students miss; if it is not done, the result is a control plane that comes up and then all the pods start crashlooping. `kubectl` will also fail with an error like `The connection to the server x.x.x.x:6443 was refused - did you specify the right host or port?`
- Create default configuration

```bash
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
```
- Edit the configuration to set up cgroups

```bash
vi /etc/containerd/config.toml
```

Scroll down till you find a line with `SystemdCgroup = false`. Edit it to be `SystemdCgroup = true`, then save and exit vi.

- Restart containerd

```bash
systemctl restart containerd
```
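As an alternative to editing the file by hand, the same change can be made non-interactively with `sed`; this is simply a convenience that assumes the default configuration generated above:

```bash
# Flip SystemdCgroup from false to true, then restart containerd
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd
```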
- Get the latest version of Kubernetes and store it in a shell variable

```bash
KUBE_LATEST=$(curl -L -s https://dl.k8s.io/release/stable.txt | awk 'BEGIN { FS="." } { printf "%s.%s", $1, $2 }')
```
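If you are curious, echo the variable; it should contain just the major and minor version of the latest stable release (for example `v1.28`, though this will change over time):

```bash
echo $KUBE_LATEST
```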
- Download the Kubernetes public signing key

```bash
mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/${KUBE_LATEST}/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
```
- Add the Kubernetes apt repository

```bash
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/${KUBE_LATEST}/deb/ /" > /etc/apt/sources.list.d/kubernetes.list
```
- Update the apt package index, install kubelet, kubeadm and kubectl, and pin their version

```bash
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl
```
- Configure `crictl` in case we need it to examine running containers

```bash
crictl config \
    --set runtime-endpoint=unix:///run/containerd/containerd.sock \
    --set image-endpoint=unix:///run/containerd/containerd.sock
```
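Once the cluster is running, `crictl` can be used on any node to look at what containerd is doing, for example:

```bash
crictl ps     # running containers on this node
crictl pods   # pod sandboxes on this node
```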
- Exit the root shell

```bash
exit
```

- Return to `student-node`

```bash
exit
```
Repeat the above till you have done `controlplane01`, `controlplane02`, `controlplane03`, `node01` and `node02`.
To create a highly available control plane, we install kubeadm on the first control plane node in almost the same way as for a single control plane cluster, then we join the other control plane nodes in a similar manner to joining worker nodes.
- SSH to `controlplane01`

```bash
ssh controlplane01
```
- Become root

```bash
sudo -i
```
- Boot the first control plane using the IP address of the load balancer as the control plane endpoint

```bash
dig +short loadbalancer
```

Replace `L.L.L.L` with the IP address you got above

```bash
kubeadm init --control-plane-endpoint L.L.L.L:6443 --upload-certs --pod-network-cidr=192.168.0.0/16
```
Copy both join commands that are printed to a notepad for use on other control nodes and the worker nodes.
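For reference, the two join commands have roughly the form sketched below; the token, discovery hash and certificate key shown are placeholders and yours will differ:

```bash
# Control plane join (includes --control-plane and --certificate-key)
kubeadm join L.L.L.L:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <key>

# Worker join
kubeadm join L.L.L.L:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
```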
- Install the network plugin (Calico). Weave does not work too well with HA clusters.

```bash
kubectl --kubeconfig /etc/kubernetes/admin.conf create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/tigera-operator.yaml
kubectl --kubeconfig /etc/kubernetes/admin.conf create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/custom-resources.yaml
```
- Check we are up and running

```bash
kubectl --kubeconfig /etc/kubernetes/admin.conf get pods -n kube-system
```
- Exit the root shell

```bash
exit
```
- Prepare the kubeconfig file for copying to `student-node`

```bash
{
  sudo cp /etc/kubernetes/admin.conf .
  sudo chmod 666 admin.conf
}
```
- Exit to `student-node`

```bash
exit
```
Be on `student-node`.

For each of `controlplane02` and `controlplane03`:
1. SSH to `controlplane02`
2. Become root

```bash
sudo -i
```
3. Paste the join command for control nodes that was output by `kubeadm init` on `controlplane01`
4. Exit back to `student-node`

```bash
exit
exit
```
5. Repeat steps 2, 3 and 4 on `controlplane03` (SSH to it first).
6. Copy the kubeconfig down from `controlplane01` to `student-node` and set proper permissions

```bash
mkdir -p ~/.kube
scp controlplane01:~/admin.conf ~/.kube/config
sudo chown $(id -u):$(id -g) ~/.kube/config
chmod 600 ~/.kube/config
```
7. Test it

```bash
kubectl get pods -n kube-system
```
You should now see that there are 3 pods for each of the main control plane components. Also, if you look at the kubeconfig file in `~/.kube/config`, you'll see that the IP address for the `server:` entry is that of the load balancer.
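An optional quick check is to grep for the `server:` line and compare it with the loadbalancer IP you noted earlier:

```bash
grep server: ~/.kube/config
```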
1. SSH to `node01`
2. Become root

```bash
sudo -i
```
3. Paste the join command for worker nodes that was output by `kubeadm init` on `controlplane01`
4. Return to `student-node`

```bash
exit
exit
```
5. Repeat steps 2, 3 and 4 on `node02` (SSH to it first).
6. Now you should be back on `student-node`. Check all nodes are up

```bash
kubectl get nodes -o wide
```
There should now be 3 control nodes and 2 workers.
Run the following on `student-node`.
- Ensure all the calico pods are running. They can take a while to initialise

```bash
watch kubectl get pods -n calico-system
```

Press `CTRL-C` to exit watch when the pods are stable.

- Deploy and expose an nginx pod

```bash
kubectl run nginx --image nginx --expose --port 80
```
- Convert the service to NodePort

```bash
kubectl edit service nginx
```

Edit the `spec:` part of the service until it looks like this. Don't change anything above `spec:`

```yaml
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
    nodePort: 30080
  selector:
    run: nginx
  sessionAffinity: None
  type: NodePort
```
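As an aside, if you prefer not to use an interactive editor, a `kubectl patch` with a strategic merge patch achieves the same result; this is just an alternative to the edit above:

```bash
kubectl patch service nginx -p '{"spec": {"type": "NodePort", "ports": [{"port": 80, "nodePort": 30080}]}}'
```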
- Get the public IP of one of the worker nodes to use in the following steps. These were output by Terraform as `address_node01` and `address_node02`. You can also find this by looking at the instances on the EC2 page of the AWS console.

- Test with curl on `student-node`. Replace the IP address with the one you chose from the above step

```bash
curl http://44.201.135.110:30080
```
- Test from your own browser. Replace the IP address with the one you chose from the above step

```
http://44.201.135.110:30080
```
Those of you who are also studying our Terraform courses should look at the terraform files and try to understand what is happening here.
One point of note is that for the `kubenode` instances, we create network interfaces for them as separate resources, then attach these ENIs to the instances when they are built. The reason for this is so that the IP addresses of the instances can be known in advance, such that `/etc/hosts` may be created by the `user_data` script during instance creation.