From 6c30578c62175d1eeb340187a5239112f0da3557 Mon Sep 17 00:00:00 2001 From: csplinter Date: Tue, 4 Mar 2025 16:51:17 -0600 Subject: [PATCH 1/4] add content for hybrid nodes pod network routability and mixed mode clusters --- latest/ug/nodes/hybrid-nodes-add-ons.adoc | 62 +++++++++++++++++++- latest/ug/nodes/hybrid-nodes-networking.adoc | 24 ++++++-- 2 files changed, 79 insertions(+), 7 deletions(-) diff --git a/latest/ug/nodes/hybrid-nodes-add-ons.adoc b/latest/ug/nodes/hybrid-nodes-add-ons.adoc index f634bcde..a7e5a378 100644 --- a/latest/ug/nodes/hybrid-nodes-add-ons.adoc +++ b/latest/ug/nodes/hybrid-nodes-add-ons.adoc @@ -62,7 +62,67 @@ The sections that follow describe differences between running compatible {aws} a [#hybrid-nodes-add-ons-core] == kube-proxy and CoreDNS -EKS installs Kube-proxy and CoreDNS as self-managed add-ons by default when you create an EKS cluster with the {aws} API and {aws} SDKs, including from the {aws} CLI. You can overwrite these add-ons as Amazon EKS add-ons after cluster creation. Reference the EKS documentation for details on <> and <>. If you are running a cluster with hybrid nodes and nodes in {aws} Cloud, we recommend that you have at least one CoreDNS replica on hybrid nodes and at least one CoreDNS replica on your nodes in {aws} Cloud. +EKS installs kube-proxy and CoreDNS as self-managed add-ons by default when you create an EKS cluster with the {aws} API and {aws} SDKs, including from the {aws} CLI. You can overwrite these add-ons as Amazon EKS add-ons after cluster creation. Reference the EKS documentation for details on <> and <>. + +If you are running a mixed mode cluster with both hybrid nodes and nodes in {aws} Cloud, we recommend that you have at least one CoreDNS replica on hybrid nodes and at least one CoreDNS replica on your nodes in {aws} Cloud. CoreDNS can be configured such that your workloads will use the closest CoreDNS replica meaning your cloud workloads will use the CoreDNS running in the cloud and your hybrid workloads will use the CoreDNS running on hybrid nodes. See the steps below for how to configure CoreDNS for a mixed mode cluster. + +. Add a topology zone label for each of your hybrid nodes. This can alternatively be done at the `nodeadm init` phase. Note, cloud nodes automatically get a topology zone label applied to them. ++ +[source,yaml,subs="verbatim,attributes"] +---- +kubectl label node topology.kubernetes.io/zone= +---- ++ +. Add `podAntiAffinity` to the CoreDNS deployment configuration for the topology zone key. You can alternatively configure the CoreDNS deployment during installation with EKS add-ons. ++ +[source,yaml,subs="verbatim,attributes"] +---- +kubectl edit deployment coredns -n kube-system +---- ++ +[source,yaml,subs="verbatim,attributes"] +---- +spec: + template: + spec: + affinity + ... + podAntiAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - podAffinityTerm: + labelSelector: + matchExpressions: + - key: k8s-app + operator: In + values: + - kube-dns + topologyKey: kubernetes.io/hostname + weight: 100 + - podAffinityTerm: + labelSelector: + matchExpressions: + - key: k8s-app + operator: In + values: + - kube-dns + topologyKey: topology.kubernetes.io/zone + weight: 50 + ... +---- ++ +. Add `trafficDistribution` to the kube-dns Service configuration. ++ +[source,yaml,subs="verbatim,attributes"] +---- +kubectl edit service kube-dns -n kube-system +---- ++ +[source,yaml,subs="verbatim,attributes"] +---- +spec: + ... 
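+  # PreferClose prefers endpoints in the same topology zone as the requesting pod, so hybrid
+  # workloads use the CoreDNS replica on hybrid nodes and cloud workloads use the replica in the cloud.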
+ trafficDistribution: PreferClose +---- [#hybrid-nodes-add-ons-cw] == CloudWatch Observability agent diff --git a/latest/ug/nodes/hybrid-nodes-networking.adoc b/latest/ug/nodes/hybrid-nodes-networking.adoc index b23ed3a1..5600a761 100644 --- a/latest/ug/nodes/hybrid-nodes-networking.adoc +++ b/latest/ug/nodes/hybrid-nodes-networking.adoc @@ -25,14 +25,26 @@ For an optimal experience, {aws} recommends reliable network connectivity of at *On-premises node and pod CIDRs* -Identify the node and pod CIDRs you will use for your hybrid nodes and the workloads running on them. The node CIDR is allocated from your on-premises network and the pod CIDR is allocated from your Container Network Interface (CNI) if you are using an overlay network for your CNI. You pass your on-premises node CIDRs and optionally pod CIDRs as inputs when you create your Amazon EKS cluster with the `RemoteNodeNetwork` and `RemotePodNetwork` fields. +Identify the node and pod CIDRs you will use for your hybrid nodes and the workloads running on them. The node CIDR is allocated from your on-premises network and the pod CIDR is allocated from your Container Network Interface (CNI) if you are using an overlay network for your CNI. You pass your on-premises node CIDRs and optionally pod CIDRs as inputs when you create your EKS cluster with the `RemoteNodeNetwork` and `RemotePodNetwork` fields. The on-premises node and pod CIDR blocks must meet the following requirements: 1. Be within one of the following `IPv4` RFC-1918 ranges: `10.0.0.0/8`, `172.16.0.0/12`, or `192.168.0.0/16`. -2. Not overlap with each other, the VPC CIDR for your Amazon EKS cluster, or your Kubernetes service `IPv4` CIDR. +2. Not overlap with each other, the VPC CIDR for your EKS cluster, or your Kubernetes service `IPv4` CIDR. -If your CNI performs Network Address Translation (NAT) for pod traffic as it leaves your on-premises hosts, you do not need to advertise your pod CIDR to your on-premises network or configure your Amazon EKS cluster with your _remote pod network_ for hybrid nodes to become ready to workloads. If your CNI does not use NAT for pod traffic as it leaves your on-premises hosts, you must advertise your pod CIDR with your on-premises network and you must configure your Amazon EKS cluster with your remote pod network for hybrid nodes to become ready to workloads. If you are running webhooks on your hybrid nodes, you must advertise your pod CIDR to your on-premises network and configure your Amazon EKS cluster with your remote pod network so the Amazon EKS control plane can directly connect to the webhooks running on hybrid nodes. +If your CNI performs Network Address Translation (NAT) for pod traffic as it leaves your on-premises hosts, you do not need to make your pod CIDR routable on your on-premises network or configure your EKS cluster with your _remote pod network_ for hybrid nodes to become ready to workloads. If your CNI does not use NAT for pod traffic as it leaves your on-premises hosts, your pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network for hybrid nodes to become ready to workloads. + +If you are running webhooks on hybrid nodes, your pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network so the EKS control plane can directly communicate with the webhooks running on hybrid nodes. For more information on the EKS add-ons that use webhooks, see <>. 
If you cannot make your pod CIDR routable on your on-premises network but need to run webhooks, it is recommended to run webhooks on cloud nodes in the same EKS cluster. See the section below on running mixed mode clusters that have both hybrid and cloud nodes. + +There are several techniques you can use to make your pod CIDR routable on your on-premises network including Border Gateway Protocol (BGP), static routes, or other custom routing solutions. BGP is the recommended solution as it is more scalable and easier to manage than alternative solutions that require custom or manual route configuration. AWS supports the BGP capabilities of Cilium and Calico for advertising hybrid nodes pod CIDRs, see <> for more information. + +*Mixed mode clusters* + +Mixed mode clusters are defined as EKS clusters that have both hybrid and cloud nodes. When running a mixed mode cluster, it is recommended to run the VPC CNI on cloud nodes and either Cilium or Calico on hybrid nodes. Cilium and Calico are not supported by AWS when running on cloud nodes. It is recommended to run at least one replica of CoreDNS on cloud nodes and at least one replica of CoreDNS on hybrid nodes, see <> for configuration steps. See <> for how to configure the webhooks used by EKS add-ons to run on cloud nodes. + +If your applications require pods running on cloud nodes to directly communicate with pods running on hybrid nodes ("east-west communication"), then your pod CIDR must be routable on your on-premises network when using the VPC CNI on cloud nodes and Cilium or Calico in overlay/tunnel mode on hybrid nodes. In this setup, the pods in the cloud get IP addresses from the VPC CNI based on the VPC subnet configuration and communicate over the VPC network. If your on-premises pod CIDR range is configured in your VPC routing table with a route to your gateway (TGW or VGW), then the traffic destined for pods on hybrid nodes will be routed to the gateway and on to your on-premises network. When the traffic reaches your on-premises network, your router must know which hybrid node owns the IP address space of the destination pod, which is why your pod CIDR must be routable on your on-premises network for this communication path to work. + +For additional information on running mixed mode clusters and the networking options available, see the link:https://docs.aws.amazon.com/eks/latest/best-practices/hybrid.html[EKS Best Practices Guide for hybrid deployments]. *Access required during hybrid node installation and upgrade* @@ -195,12 +207,12 @@ Depending on your choice of CNI, you need to configure additional network access [NOTE] ==== -^1^ The IPs of the Amazon EKS cluster. See the following section on Amazon EKS elastic network interfaces. +^1^ The IPs of the EKS cluster. See the following section on Amazon EKS elastic network interfaces. ==== *Amazon EKS network interfaces* -Amazon EKS attaches network interfaces to the subnets in the VPC you pass during cluster creation to enable the communication between the Amazon EKS control plane and your VPC. The network interfaces that Amazon EKS creates can be found after cluster creation in the Amazon EC2 console or with the {aws} CLI. The original network interfaces are deleted and new network interfaces are created when changes are applied on your Amazon EKS cluster, such as Kubernetes version upgrades. 
You can restrict the IP range for the Amazon EKS network interfaces by using constrained subnet sizes for the subnets you pass during cluster creation, which makes it easier to configure your on-premises firewall to allow inbound/outbound connectivity to this known, constrained set of IPs. To control which subnets network interfaces are created in, you can limit the number of subnets you specify when you create a cluster or you can update the subnets after creating the cluster. +Amazon EKS attaches network interfaces to the subnets in the VPC you pass during cluster creation to enable the communication between the EKS control plane and your VPC. The network interfaces that Amazon EKS creates can be found after cluster creation in the Amazon EC2 console or with the {aws} CLI. The original network interfaces are deleted and new network interfaces are created when changes are applied on your EKS cluster, such as Kubernetes version upgrades. You can restrict the IP range for the Amazon EKS network interfaces by using constrained subnet sizes for the subnets you pass during cluster creation, which makes it easier to configure your on-premises firewall to allow inbound/outbound connectivity to this known, constrained set of IPs. To control which subnets network interfaces are created in, you can limit the number of subnets you specify when you create a cluster or you can update the subnets after creating the cluster. The network interfaces provisioned by Amazon EKS have a description of the format `Amazon EKS [.replaceable]``your-cluster-name```. See the example below for an {aws} CLI command you can use to find the IP addresses of the network interfaces that Amazon EKS provisions. Replace `VPC_ID` with the ID of the VPC you pass during cluster creation. @@ -344,7 +356,7 @@ aws ec2 associate-route-table --route-table-id [.replaceable]`RT_ID` --subnet-id [#hybrid-nodes-networking-cluster-sg] == Cluster security group configuration -The following access for your Amazon EKS cluster security group is required for ongoing cluster operations. +The following access for your EKS cluster security group is required for ongoing cluster operations. [cols="1,1,1,1,1,1,1", options="header"] |=== From dcb1ddf7c235cbbdf5216adac79acdaa630c3f1a Mon Sep 17 00:00:00 2001 From: csplinter Date: Wed, 5 Mar 2025 14:27:52 -0600 Subject: [PATCH 2/4] CNI docs improvements, add considerations section --- latest/ug/nodes/hybrid-nodes-cni.adoc | 64 +++++++++++++------- latest/ug/nodes/hybrid-nodes-networking.adoc | 2 +- 2 files changed, 44 insertions(+), 22 deletions(-) diff --git a/latest/ug/nodes/hybrid-nodes-cni.adoc b/latest/ug/nodes/hybrid-nodes-cni.adoc index 7e240fd7..23769f9e 100644 --- a/latest/ug/nodes/hybrid-nodes-cni.adoc +++ b/latest/ug/nodes/hybrid-nodes-cni.adoc @@ -8,20 +8,20 @@ include::../attributes.txt[] [abstract] -- -Configure a CNI for Amazon EKS hybrid nodes +Configure a CNI for hybrid nodes -- Cilium and Calico are supported as the Container Networking Interfaces (CNIs) for Amazon EKS Hybrid Nodes. You must install a CNI for hybrid nodes to become ready to serve workloads. Hybrid nodes appear with status `Not Ready` until a CNI is running. You can manage these CNIs with your choice of tools such as Helm. The Amazon VPC CNI is not compatible with hybrid nodes and the VPC CNI is configured with anti-affinity for the `eks.amazonaws.com/compute-type: hybrid` label. 
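For example, you can check which hybrid nodes are still waiting on a CNI by filtering on the `eks.amazonaws.com/compute-type: hybrid` label. This is a quick sanity check; hybrid nodes remain in `Not Ready` status until the CNI is running on them.

[source,bash,subs="verbatim,attributes"]
----
kubectl get nodes -l eks.amazonaws.com/compute-type=hybrid
----
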
== CNI version compatibility -Calico version `3.29.x` is supported and recommended for EKS Hybrid Nodes for every Kubernetes version supported in Amazon EKS. - Cilium version `1.16.x` is supported and recommended for EKS Hybrid Nodes for every Kubernetes version supported in Amazon EKS. +Calico version `3.29.x` is supported and recommended for EKS Hybrid Nodes for every Kubernetes version supported in Amazon EKS. + == Supported capabilities -{aws} supports the following capabilities of Cilium and Calico for use with hybrid nodes. If you plan to use functionality outside the scope of {aws} support, we recommend that you obtain commercial support for the plugin or have the in-house expertise to troubleshoot and contribute fixes to the CNI plugin project. +{aws} provides technical support for the following capabilities of Cilium and Calico for use with hybrid nodes. If you plan to use functionality outside the scope of {aws} support, we recommend that you obtain commercial support for the plugin or have the in-house expertise to troubleshoot and contribute fixes to the CNI plugin project. [cols="1,1,1", options="header"] @@ -63,9 +63,17 @@ Cilium version `1.16.x` is supported and recommended for EKS Hybrid Nodes for ev |Yes |=== +== Cilium considerations + +- By default, Cilium is configured to run in overlay / tunnel mode with VXLAN as the link:https://docs.cilium.io/en/stable/network/concepts/routing/#encapsulation[encapsulation method]. This mode has the fewest requirements on the underlying physical network. +- By default, Cilium link:https://docs.cilium.io/en/stable/network/concepts/masquerading/[masquerades] the source IP address of all pod traffic leaving the cluster to the IP address of the node. This makes it possible to run Cilium with Amazon EKS clusters that have remote pod networks configured and with clusters that don't have remote pod networks configured. If you disable masquerading, then your pod CIDRs must be routable on your on-premises network and you must configure your Amazon EKS cluster with your remote pod networks. +- If you are running webhooks on your hybrid nodes, your pod CIDRs must be routable on your on-premises network and you must configure your Amazon EKS cluster with your remote pod networks. If your pod CIDRs are not routable on your on-premises network, then it is recommended to run webhooks on cloud nodes in the same cluster. See <> for more information. +- A common way to advertise pod addresses with your on-premises network is by using BGP. To use BGP with Cilium, you must set `bgpControlPlane.enabled: true` in your Helm configuration. For more information on Cilium's BGP support, see https://docs.cilium.io/en/stable/network/bgp-control-plane/bgp-control-plane/[Cilium BGP Control Plane] in the Cilium documentation. +- The default IP Address Management (IPAM) in Cilium is called link:https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/[Cluster Scope], where the Cilium operator allocates IP addresses for each node based on user-configured pod CIDRs. The pod CIDRs are configured with the `clusterPoolIPv4PodCIDRList` Helm value. Cilium allocates segments from the `clusterPoolIPv4PodCIDRList` to each node. The size of the per node segments is configured with the `clusterPoolIPv4MaskSize` Helm value. The `clusterPoolIPv4PodCIDRList` should match the remote pod network CIDRs you configured for your Amazon EKS cluster. 
For more information on the `clusterPoolIPv4PodCIDRList` and `clusterPoolIPv4MaskSize`, see https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/#expanding-the-cluster-pool[Expanding the cluster pool] in the Cilium documentation. + == Install Cilium on hybrid nodes -. Ensure that you have installed the helm CLI on your command-line environment. See the https://helm.sh/docs/intro/quickstart/[Helm documentation] for installation instructions. +. Ensure that you have installed the Helm CLI on your command-line environment. See the https://helm.sh/docs/intro/quickstart/[Helm documentation] for installation instructions. . Install the Cilium Helm repo. + [source,bash,subs="verbatim,attributes"] @@ -73,11 +81,12 @@ Cilium version `1.16.x` is supported and recommended for EKS Hybrid Nodes for ev helm repo add cilium https://helm.cilium.io/ ---- -. Create a YAML file called `cilium-values.yaml`. If you configured at least one _remote pod network_, configure the same pod CIDRs for your `clusterPoolIPv4PodCIDRList`. You shouldn't change your `clusterPoolIPv4PodCIDRList` after deploying Cilium on your cluster. You can configure `clusterPoolIPv4MaskSize` based on your required pods per node, see https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/#expanding-the-cluster-pool[Expanding the cluster pool] in the Cilium documentation. For a full list of Helm values for Cilium, see the https://docs.cilium.io/en/stable/helm-reference/[Helm reference] in the Cilium documentation. The following example configures all of the Cilium components to run on only the hybrid nodes, since they have the `eks.amazonaws.com/compute-type: hybrid` label. -+ -By default, Cilium masquerades the source IP address of all pod traffic leaving the cluster to the IP address of the node. This makes it possible for Cilium to run with Amazon EKS clusters that have remote pod networks configured and with clusters that don't have remote pod networks configured. If you disable masquerading for your Cilium deployment, then you must configure your Amazon EKS cluster with your remote pod networks and you must advertise your pod addresses with your on-premises network. If you are running webhooks on your hybrid nodes, you must configure your cluster with your remote pod networks and you must advertise your pod addresses with your on-premises network. -+ -A common way to advertise pod addresses with your on-premises network is by using BGP. To use BGP with Cilium, you must set `bgpControlPlane.enabled: true`. For more information on Cilium's BGP support, see https://docs.cilium.io/en/stable/network/bgp-control-plane/bgp-control-plane/[Cilium BGP Control Plane] in the Cilium documentation. +. Create a YAML file called `cilium-values.yaml`. The following example configures Cilium to run on hybrid nodes only by setting affinity for the `eks.amazonaws.com/compute-type: hybrid` label. + +- If you configured your Amazon EKS cluster with _remote pod networks_, configure the same pod CIDRs for your `clusterPoolIPv4PodCIDRList`. For example, `10.100.0.0/24`. +- Configure `clusterPoolIPv4MaskSize` based on your required pods per node. For example, `25` for a /25 segment size of 128 pods per node. +- You should not change your `clusterPoolIPv4PodCIDRList` or `clusterPoolIPv4MaskSize` after deploying Cilium on your cluster, see https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/#expanding-the-cluster-pool[Expanding the cluster pool] in the Cilium documentation. 
+- For a full list of Helm values for Cilium, see the https://docs.cilium.io/en/stable/helm-reference/[Helm reference] in the Cilium documentation. + [source,bash,subs="verbatim,attributes"] ---- @@ -93,9 +102,9 @@ affinity: ipam: mode: cluster-pool operator: - clusterPoolIPv4MaskSize: 25 + clusterPoolIPv4MaskSize: [.replaceable]`25` clusterPoolIPv4PodCIDRList: - - POD_CIDR + - [.replaceable]`POD_CIDR` operator: affinity: nodeAffinity: @@ -112,7 +121,10 @@ envoy: enabled: false ---- -. Install Cilium on your cluster. Replace `CILIUM_VERSION` with your desired Cilium version. It is recommended to run the latest patch version for your Cilium minor version. You can find the latest patch release for a given minor Cilium release in the https://github.com/cilium/cilium#stable-releases[Stable Releases section] of the Cilium documentation. If you are enabling BGP for your deployment, add the `--set bgpControlPlane.enabled=true` flag in the command below. If you are using a specific kubeconfig file, use the `--kubeconfig` flag with the Helm install command. +. Install Cilium on your cluster. +- Replace `CILIUM_VERSION` with your desired Cilium version. It is recommended to run the latest patch version for your Cilium minor version. You can find the latest patch release for a given minor Cilium release in the https://github.com/cilium/cilium#stable-releases[Stable Releases section] of the Cilium documentation. +- If you are enabling BGP for your deployment, add the `--set bgpControlPlane.enabled=true` flag in the command below. +- If you are using a specific kubeconfig file, use the `--kubeconfig` flag with the Helm install command. + [source,bash,subs="verbatim,attributes,quotes"] ---- @@ -338,6 +350,14 @@ The interfaces and routes configured by Cilium are not removed by default when t kubectl get crds -oname | grep "cilium" | xargs kubectl delete ---- +== Calico considerations + +- It is recommended to run Calico in overlay / tunnel mode with VXLAN as the link:https://docs.tigera.io/calico/latest/networking/configuring/vxlan-ipip[encapsulation method]. This mode has the fewest requirements on the underlying physical network. For more information on the different Calico networking modes, see https://docs.tigera.io/calico/latest/networking/determine-best-networking[Determining the best networking option] in the Calico documentation. +- It is recommended to run Calico with `natOutgoing` set to `true`. With `natOutgoing` set to `true`, the source IP address of all pod traffic leaving the cluster is translated to the IP address of the node. This makes it possible to run Calico with Amazon EKS clusters that have remote pod networks configured and with clusters that don't have remote pod networks configured. If you disable `natOutgoing`, then your pod CIDRs must be routable on your on-premises network and you must configure your Amazon EKS cluster with your remote pod networks. +- If you are running webhooks on your hybrid nodes, your pod CIDRs must be routable on your on-premises network and you must configure your Amazon EKS cluster with your remote pod networks. If your pod CIDRs are not routable on your on-premises network, then it is recommended to run webhooks on cloud nodes in the same cluster. See <> for more information. +- A common way to advertise pod addresses with your on-premises network is by using BGP. To use BGP with Calico, you must set `installation.calicoNetwork.bgp: Enabled` in your Helm configuration.
For more information on Calico's BGP support, see link:https://docs.tigera.io/calico/latest/networking/configuring/bgp[Configure BGP peering] in the Calico documentation. +- The default IP Address Management (IPAM) in Calico is called link:https://docs.tigera.io/calico/latest/networking/ipam/get-started-ip-addresses#calico-ipam[Calico IPAM], where the `calico-ipam` plugin allocates IP addresses for each node based on user-configured pod CIDRs. The pod CIDRs are configured with the `installation.calicoNetwork.ipPools.cidr` Helm value. Calico allocates segments from the `ipPools.cidr` to each node. The size of the per node segments is configured with the `ipPools.blockSize` Helm value. The `ipPools.cidr` should match the remote pod network CIDRs you configured for your Amazon EKS cluster. For more information on IPAM with Calico, see link:https://docs.tigera.io/calico/latest/networking/ipam/get-started-ip-addresses[Get started with IP address management] in the Calico documentation. + == Install Calico on hybrid nodes . Ensure that you have installed the helm CLI on your command-line environment. See the https://helm.sh/docs/intro/quickstart/[Helm documentation] for installation instructions. @@ -348,10 +368,10 @@ kubectl get crds -oname | grep "cilium" | xargs kubectl delete helm repo add projectcalico https://docs.tigera.io/calico/charts ---- -. Create a YAML file called `calico-values.yaml` that configures Calico with affinity to run on hybrid nodes. For more information on the different Calico networking modes, see https://docs.tigera.io/calico/latest/networking/determine-best-networking[Determining the best networking option] in the Calico documentation. -.. Replace `POD_CIDR` with the CIDR ranges for your pods. If you configured your Amazon EKS cluster with remote pod networks, the `POD_CIDR` that you specify for Calico should be the same as the remote pod networks. For example, `10.100.0.0/24`. -.. Replace `CIDR_SIZE` with the size of the CIDR segment you want to allocate to each node. For example, `25` for a /25 segment size. For more information on CIDR `blockSize` and changing the `blockSize`, see https://docs.tigera.io/calico/latest/networking/ipam/change-block-size[Change IP pool block size] in the Calico documentation. -.. In the example below, `natOutgoing` is enabled and `bgp` is disabled. In this configuration, Calico can run on Amazon EKS clusters that have Remote Pod Network configured and can run on clusters that do not have Remote Pod Network configured. If you have `natOutgoing` set to disabled, you must configure your cluster with your remote pod networks and your on-premises network must be able to properly route traffic destined for your pod CIDRs. A common way to advertise pod addresses with your on-premises network is by using BGP. To use BGP with Calico, you must enable `bgp`. The example below configures all of the Calico components to run on only the hybrid nodes, since they have the `eks.amazonaws.com/compute-type: hybrid` label. If you are running webhooks on your hybrid nodes, you must configure your cluster with your Remote Pod Networks and you must advertise your pod addresses with your on-premises network. The example below configures `controlPlaneReplicas: 1`, increase the value if you have multiple hybrid nodes and want to run the Calico control plane components in a highly available fashion. +. Create a YAML file called `calico-values.yaml`. 
The following example configures all Calico components to run on hybrid nodes only by setting affinity for the `eks.amazonaws.com/compute-type: hybrid` label. +- Replace `POD_CIDR` with the CIDR ranges for your pods. If you configured your Amazon EKS cluster with remote pod networks, the `POD_CIDR` that you specify for Calico should be the same as the remote pod networks. For example, `10.100.0.0/24`. +- Replace `CIDR_SIZE` with the size of the CIDR segment you want to allocate to each node. For example, `25` for a /25 segment size of 128 pod addresses per node. For more information on CIDR `blockSize` and changing the `blockSize`, see https://docs.tigera.io/calico/latest/networking/ipam/change-block-size[Change IP pool block size] in the Calico documentation. +- In the example below, `natOutgoing` is enabled and `bgp` is disabled. Modify these values based on your target configuration. + [source,yaml,subs="verbatim,attributes"] ---- @@ -362,10 +382,10 @@ installation: ipam: type: Calico calicoNetwork: - bgp: Disabled + bgp: [.replaceable]`Disabled` ipPools: - - cidr: POD_CIDR - blockSize: CIDR_SIZE + - cidr: [.replaceable]`POD_CIDR` + blockSize: [.replaceable]`CIDR_SIZE` encapsulation: VXLAN natOutgoing: Enabled nodeSelector: eks.amazonaws.com/compute-type == "hybrid" @@ -398,7 +418,9 @@ installation: eks.amazonaws.com/compute-type: hybrid ---- -. Install Calico on your cluster. Replace `CALICO_VERSION` with your desired Calico version (for example 3.29.0), see the https://github.com/projectcalico/calico/releases[Calico releases] to find the latest patch release for your Calico minor version. It is recommended to run the latest patch version for the Calico minor version. If you are using a specific `kubeconfig` file, use the `--kubeconfig` flag. +. Install Calico on your cluster. +- Replace `CALICO_VERSION` with your desired Calico version (for example 3.29.0), see the https://github.com/projectcalico/calico/releases[Calico releases] to find the latest patch release for your Calico minor version. It is recommended to run the latest patch version for the Calico minor version. +- If you are using a specific `kubeconfig` file, use the `--kubeconfig` flag. + [source,bash,subs="verbatim,attributes,quotes"] ---- diff --git a/latest/ug/nodes/hybrid-nodes-networking.adoc b/latest/ug/nodes/hybrid-nodes-networking.adoc index 5600a761..e6a0caaf 100644 --- a/latest/ug/nodes/hybrid-nodes-networking.adoc +++ b/latest/ug/nodes/hybrid-nodes-networking.adoc @@ -34,7 +34,7 @@ The on-premises node and pod CIDR blocks must meet the following requirements: If your CNI performs Network Address Translation (NAT) for pod traffic as it leaves your on-premises hosts, you do not need to make your pod CIDR routable on your on-premises network or configure your EKS cluster with your _remote pod network_ for hybrid nodes to become ready to workloads. If your CNI does not use NAT for pod traffic as it leaves your on-premises hosts, your pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network for hybrid nodes to become ready to workloads. -If you are running webhooks on hybrid nodes, your pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network so the EKS control plane can directly communicate with the webhooks running on hybrid nodes. For more information on the EKS add-ons that use webhooks, see <>. 
If you cannot make your pod CIDR routable on your on-premises network but need to run webhooks, it is recommended to run webhooks on cloud nodes in the same EKS cluster. See the section below on running mixed mode clusters that have both hybrid and cloud nodes. +If you are running webhooks on hybrid nodes, your pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network so the EKS control plane can directly communicate with the webhooks running on hybrid nodes. If you cannot make your pod CIDR routable on your on-premises network but need to run webhooks, it is recommended to run webhooks on cloud nodes in the same EKS cluster. For more information on the EKS add-ons that use webhooks, see <>. There are several techniques you can use to make your pod CIDR routable on your on-premises network including Border Gateway Protocol (BGP), static routes, or other custom routing solutions. BGP is the recommended solution as it is more scalable and easier to manage than alternative solutions that require custom or manual route configuration. AWS supports the BGP capabilities of Cilium and Calico for advertising hybrid nodes pod CIDRs, see <> for more information. From d006c28a5f93b059be4a424ceba09d94ec12d4af Mon Sep 17 00:00:00 2001 From: csplinter Date: Wed, 5 Mar 2025 14:47:10 -0600 Subject: [PATCH 3/4] use correct formats for code snippets --- latest/ug/nodes/hybrid-nodes-add-ons.adoc | 12 ++++++------ latest/ug/nodes/hybrid-nodes-cni.adoc | 4 ++-- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/latest/ug/nodes/hybrid-nodes-add-ons.adoc b/latest/ug/nodes/hybrid-nodes-add-ons.adoc index a7e5a378..36811270 100644 --- a/latest/ug/nodes/hybrid-nodes-add-ons.adoc +++ b/latest/ug/nodes/hybrid-nodes-add-ons.adoc @@ -64,18 +64,18 @@ The sections that follow describe differences between running compatible {aws} a EKS installs kube-proxy and CoreDNS as self-managed add-ons by default when you create an EKS cluster with the {aws} API and {aws} SDKs, including from the {aws} CLI. You can overwrite these add-ons as Amazon EKS add-ons after cluster creation. Reference the EKS documentation for details on <> and <>. -If you are running a mixed mode cluster with both hybrid nodes and nodes in {aws} Cloud, we recommend that you have at least one CoreDNS replica on hybrid nodes and at least one CoreDNS replica on your nodes in {aws} Cloud. CoreDNS can be configured such that your workloads will use the closest CoreDNS replica meaning your cloud workloads will use the CoreDNS running in the cloud and your hybrid workloads will use the CoreDNS running on hybrid nodes. See the steps below for how to configure CoreDNS for a mixed mode cluster. +If you are running a mixed mode cluster with both hybrid nodes and nodes in {aws} Cloud, we recommend that you have at least one CoreDNS replica on hybrid nodes and at least one CoreDNS replica on your nodes in {aws} Cloud. CoreDNS can be configured such that your workloads will use the closest CoreDNS replica, meaning your cloud workloads will use the CoreDNS running in the cloud and your hybrid workloads will use the CoreDNS running on hybrid nodes. See the steps below for how to configure CoreDNS for a mixed mode cluster. -. Add a topology zone label for each of your hybrid nodes. This can alternatively be done at the `nodeadm init` phase. Note, cloud nodes automatically get a topology zone label applied to them. +. 
Add a topology zone label for each of your hybrid nodes, for example `topology.kubernetes.io/zone: onprem`. This can alternatively be done at the `nodeadm init` phase. Note, nodes running in {aws} Cloud automatically get a topology zone label applied to them. + -[source,yaml,subs="verbatim,attributes"] +[source,bash,subs="verbatim,attributes"] ---- kubectl label node topology.kubernetes.io/zone= ---- + . Add `podAntiAffinity` to the CoreDNS deployment configuration for the topology zone key. You can alternatively configure the CoreDNS deployment during installation with EKS add-ons. + -[source,yaml,subs="verbatim,attributes"] +[source,bash,subs="verbatim,attributes"] ---- kubectl edit deployment coredns -n kube-system ---- @@ -112,7 +112,7 @@ spec: + . Add `trafficDistribution` to the kube-dns Service configuration. + -[source,yaml,subs="verbatim,attributes"] +[source,bash,subs="verbatim,attributes"] ---- kubectl edit service kube-dns -n kube-system ---- @@ -131,7 +131,7 @@ As the CloudWatch Observability agent runs https://kubernetes.io/docs/reference/ Node-level metrics are not available for hybrid nodes because link:AmazonCloudWatch/latest/monitoring/ContainerInsights.html[CloudWatch Container Insights,type="documentation"] depends on the availability of link:AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html[Instance Metadata Service,type="documentation"] (IMDS) for node-level metrics. Cluster, workload, pod, and container-level metrics are available for hybrid nodes. After installing the add-on by following the steps described in link:AmazonCloudWatch/latest/monitoring/install-CloudWatch-Observability-EKS-addon.html[Install the CloudWatch agent with the Amazon CloudWatch Observability,type="documentation"], the add-on manifest must be updated before the agent can run successfully on hybrid nodes. Edit the `amazoncloudwatchagents` resource on the cluster to add the `RUN_WITH_IRSA` environment variable as shown below. -[source,yaml,subs="verbatim,attributes"] +[source,bash,subs="verbatim,attributes"] ---- kubectl edit amazoncloudwatchagents -n amazon-cloudwatch cloudwatch-agent ---- diff --git a/latest/ug/nodes/hybrid-nodes-cni.adoc b/latest/ug/nodes/hybrid-nodes-cni.adoc index 23769f9e..8e408c2e 100644 --- a/latest/ug/nodes/hybrid-nodes-cni.adoc +++ b/latest/ug/nodes/hybrid-nodes-cni.adoc @@ -68,7 +68,7 @@ Calico version `3.29.x` is supported and recommended for EKS Hybrid Nodes for ev - By default, Cilium is configured to run in overlay / tunnel mode with VXLAN as the link:https://docs.cilium.io/en/stable/network/concepts/routing/#encapsulation[encapsulation method]. This mode has the fewest requirements on the underlying physical network. - By default, Cilium link:https://docs.cilium.io/en/stable/network/concepts/masquerading/[masquerades] the source IP address of all pod traffic leaving the cluster to the IP address of the node. This makes it possible to run Cilium with Amazon EKS clusters that have remote pod networks configured and with clusters that don't have remote pod networks configured. If you disable masquerading, then your pod CIDRs must be routable on your on-premises network and you must configure your Amazon EKS cluster with your remote pod networks. - If you are running webhooks on your hybrid nodes, your pod CIDRs must be routable on your on-premises network and you must configure your Amazon EKS cluster with your remote pod networks. 
If your pod CIDRs are not routable on your on-premises network, then it is recommended to run webhooks on cloud nodes in the same cluster. See <> for more information. -- A common way to advertise pod addresses with your on-premises network is by using BGP. To use BGP with Cilium, you must set `bgpControlPlane.enabled: true` in your Helm configuration. For more information on Cilium's BGP support, see https://docs.cilium.io/en/stable/network/bgp-control-plane/bgp-control-plane/[Cilium BGP Control Plane] in the Cilium documentation. +- A common way to make your pod CIDR routable on your on-premises network is to advertise pod addresses with BGP. To use BGP with Cilium, you must set `bgpControlPlane.enabled: true` in your Helm configuration. For more information on Cilium's BGP support, see https://docs.cilium.io/en/stable/network/bgp-control-plane/bgp-control-plane/[Cilium BGP Control Plane] in the Cilium documentation. - The default IP Address Management (IPAM) in Cilium is called link:https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/[Cluster Scope], where the Cilium operator allocates IP addresses for each node based on user-configured pod CIDRs. The pod CIDRs are configured with the `clusterPoolIPv4PodCIDRList` Helm value. Cilium allocates segments from the `clusterPoolIPv4PodCIDRList` to each node. The size of the per node segments is configured with the `clusterPoolIPv4MaskSize` Helm value. The `clusterPoolIPv4PodCIDRList` should match the remote pod network CIDRs you configured for your Amazon EKS cluster. For more information on the `clusterPoolIPv4PodCIDRList` and `clusterPoolIPv4MaskSize`, see https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/#expanding-the-cluster-pool[Expanding the cluster pool] in the Cilium documentation. == Install Cilium on hybrid nodes @@ -355,7 +355,7 @@ kubectl get crds -oname | grep "cilium" | xargs kubectl delete - It is recommended to run Calico in overlay / tunnel mode with VXLAN as the link:https://docs.tigera.io/calico/latest/networking/configuring/vxlan-ipip[encapsulation method]. This mode has the fewest requirements on the underlying physical network. For more information on the different Calico networking modes, see https://docs.tigera.io/calico/latest/networking/determine-best-networking[Determining the best networking option] in the Calico documentation. - It is recommended to run Calico with `natOutgoing` set to `true`. With `natOutgoing` set to `true`, the source IP address of all pod traffic leaving the cluster is translated to the IP address of the node. This makes it possible to run Calico with Amazon EKS clusters that have remote pod networks configured and with clusters that don't have remote pod networks configured. If you disable `natOutgoing`, then your pod CIDRs must be routable on your on-premises network and you must configure your Amazon EKS cluster with your remote pod networks. - If you are running webhooks on your hybrid nodes, your pod CIDRs must be routable on your on-premises network and you must configure your Amazon EKS cluster with your remote pod networks. If your pod CIDRs are not routable on your on-premises network, then it is recommended to run webhooks on cloud nodes in the same cluster. See <> for more information. -- A common way to advertise pod addresses with your on-premises network is by using BGP. To use BGP with Calico, you must set `installation.calicoNetwork.bgp: Enabled` in your Helm configuration.
For more information on Calico's BGP support, see link:https://docs.tigera.io/calico/latest/networking/configuring/bgp[Configure BGP peering] in the Calico documentation. +- A common way to make your pod CIDR routable on your on-premises network is to advertise pod addresses with BGP. To use BGP with Calico, you must set `installation.calicoNetwork.bgp: Enabled` in your Helm configuration. For more information on Calico's BGP support, see link:https://docs.tigera.io/calico/latest/networking/configuring/bgp[Configure BGP peering] in the Calico documentation. - The default IP Address Management (IPAM) in Calico is called link:https://docs.tigera.io/calico/latest/networking/ipam/get-started-ip-addresses#calico-ipam[Calico IPAM], where the `calico-ipam` plugin allocates IP addresses for each node based on user-configured pod CIDRs. The pod CIDRs are configured with the `installation.calicoNetwork.ipPools.cidr` Helm value. Calico allocates segments from the `ipPools.cidr` to each node. The size of the per node segments is configured with the `ipPools.blockSize` Helm value. The `ipPools.cidr` should match the remote pod network CIDRs you configured for your Amazon EKS cluster. For more information on IPAM with Calico, see link:https://docs.tigera.io/calico/latest/networking/ipam/get-started-ip-addresses[Get started with IP address management] in the Calico documentation. == Install Calico on hybrid nodes From 454afb8b9ac0de3a77df795871dfa909c9555eba Mon Sep 17 00:00:00 2001 From: csplinter Date: Wed, 5 Mar 2025 14:58:17 -0600 Subject: [PATCH 4/4] more formatting changes for replaceable fields --- latest/ug/nodes/hybrid-nodes-add-ons.adoc | 6 +++--- latest/ug/nodes/hybrid-nodes-cni.adoc | 20 ++++++++++---------- latest/ug/nodes/hybrid-nodes-networking.adoc | 2 -- 3 files changed, 13 insertions(+), 15 deletions(-) diff --git a/latest/ug/nodes/hybrid-nodes-add-ons.adoc b/latest/ug/nodes/hybrid-nodes-add-ons.adoc index 36811270..0465210d 100644 --- a/latest/ug/nodes/hybrid-nodes-add-ons.adoc +++ b/latest/ug/nodes/hybrid-nodes-add-ons.adoc @@ -68,14 +68,14 @@ If you are running a mixed mode cluster with both hybrid nodes and nodes in {aws . Add a topology zone label for each of your hybrid nodes, for example `topology.kubernetes.io/zone: onprem`. This can alternatively be done at the `nodeadm init` phase. Note, nodes running in {aws} Cloud automatically get a topology zone label applied to them. + -[source,bash,subs="verbatim,attributes"] +[source,bash,subs="verbatim,attributes,quotes"] ---- -kubectl label node topology.kubernetes.io/zone= +kubectl label node [.replaceable]`hybrid-node-name` topology.kubernetes.io/zone=[.replaceable]`zone` ---- + . Add `podAntiAffinity` to the CoreDNS deployment configuration for the topology zone key. You can alternatively configure the CoreDNS deployment during installation with EKS add-ons.
+ -[source,bash,subs="verbatim,attributes"] +[source,bash,subs="verbatim,attributes,quotes"] ---- kubectl edit deployment coredns -n kube-system ---- diff --git a/latest/ug/nodes/hybrid-nodes-cni.adoc b/latest/ug/nodes/hybrid-nodes-cni.adoc index 8e408c2e..13883670 100644 --- a/latest/ug/nodes/hybrid-nodes-cni.adoc +++ b/latest/ug/nodes/hybrid-nodes-cni.adoc @@ -88,7 +88,7 @@ helm repo add cilium https://helm.cilium.io/ - You should not change your `clusterPoolIPv4PodCIDRList` or `clusterPoolIPv4MaskSize` after deploying Cilium on your cluster, see https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/#expanding-the-cluster-pool[Expanding the cluster pool] in the Cilium documentation. - For a full list of Helm values for Cilium, see the https://docs.cilium.io/en/stable/helm-reference/[Helm reference] in the Cilium documentation. + -[source,bash,subs="verbatim,attributes"] +[source,bash,subs="verbatim,attributes,quotes"] ---- affinity: nodeAffinity: @@ -161,7 +161,7 @@ mi-04a2cf999b7112233 Ready 19m v1.31.0-eks-a737599 . To use BGP with Cilium to advertise your pod addresses with your on-premises network, you must have installed Cilium with `bgpControlPlane.enabled: true`. To configure BGP in Cilium, first create a file called `cilium-bgp-cluster.yaml` with a `CiliumBGPClusterConfig` with the `peerAddress` set to your on-premises router IP that you are peering with. Configure the `localASN` and `peerASN` based on your on-premises router configuration. + -[source,yaml,subs="verbatim,attributes"] +[source,yaml,subs="verbatim,attributes,quotes"] ---- apiVersion: cilium.io/v2alpha1 kind: CiliumBGPClusterConfig @@ -176,11 +176,11 @@ spec: - hybrid bgpInstances: - name: "rack0" - localASN: ONPREM_ROUTER_ASN + localASN: [.replaceable]`ONPREM_ROUTER_ASN` peers: - name: "onprem-router" - peerASN: PEER_ASN - peerAddress: ONPREM_ROUTER_IP + peerASN: [.replaceable]`PEER_ASN` + peerAddress: [.replaceable]`ONPREM_ROUTER_IP` peerConfigRef: name: "cilium-peer" ---- @@ -314,11 +314,11 @@ helm uninstall cilium-preflight --namespace kube-system + Before running the helm upgrade command, preserve the values for your deployment in a `cilium-values.yaml` or use `--set` command line options for your settings. The upgrade operation overwrites the Cilium ConfigMap, so it is critical that your configuration values are passed when you upgrade. If you are using BGP, it is recommended to use the `--set bgpControlPlane=true` command line option instead of supplying this information in your values file. + -[source,bash,subs="verbatim,attributes"] +[source,bash,subs="verbatim,attributes,quotes"] ---- -helm upgrade cilium cilium/cilium --version CILIUM_VERSION \ +helm upgrade cilium cilium/cilium --version [.replaceable]`CILIUM_VERSION` \ --namespace kube-system \ - --set upgradeCompatibility=1.X \ + --set upgradeCompatibility=[.replaceable]`1.X` \ -f cilium-values.yaml ---- @@ -373,7 +373,7 @@ helm repo add projectcalico https://docs.tigera.io/calico/charts - Replace `CIDR_SIZE` with the size of the CIDR segment you want to allocate to each node. For example, `25` for a /25 segment size of 128 pod addresses per node. For more information on CIDR `blockSize` and changing the `blockSize`, see https://docs.tigera.io/calico/latest/networking/ipam/change-block-size[Change IP pool block size] in the Calico documentation. - In the example below, `natOutgoing` is enabled and `bgp` is disabled. Modify these values based on your target configuration. 
+ -[source,yaml,subs="verbatim,attributes"] +[source,yaml,subs="verbatim,attributes,quotes"] ---- installation: enabled: true @@ -525,7 +525,7 @@ kubectl apply --server-side --force-conflicts \ -f https://raw.githubusercontent.com/projectcalico/calico/[.replaceable]`CALICO_VERSION`/manifests/operator-crds.yaml ---- -. Run `helm upgrade` to upgrade your Calico deployment. Replace CALICO_VERSION with the version you are upgrading to, for example `v3.29.0`. Create the `calico-values.yaml` file from the configuration values that you used to install Calico. +. Run `helm upgrade` to upgrade your Calico deployment. Replace `CALICO_VERSION` with the version you are upgrading to, for example `v3.29.0`. Create the `calico-values.yaml` file from the configuration values that you used to install Calico. + [source,bash,subs="verbatim,attributes,quotes"] ---- diff --git a/latest/ug/nodes/hybrid-nodes-networking.adoc b/latest/ug/nodes/hybrid-nodes-networking.adoc index e6a0caaf..d3c362ce 100644 --- a/latest/ug/nodes/hybrid-nodes-networking.adoc +++ b/latest/ug/nodes/hybrid-nodes-networking.adoc @@ -44,8 +44,6 @@ Mixed mode clusters are defined as EKS clusters that have both hybrid and cloud If your applications require pods running on cloud nodes to directly communicate with pods running on hybrid nodes ("east-west communication"), then your pod CIDR must be routable on your on-premises network when using the VPC CNI on cloud nodes and Cilium or Calico in overlay/tunnel mode on hybrid nodes. In this setup, the pods in the cloud get IP addresses from the VPC CNI based on the VPC subnet configuration and communicate over the VPC network. If your on-premises pod CIDR range is configured in your VPC routing table with a route to your gateway (TGW or VGW), then the traffic destined for pods on hybrid nodes will be routed to the gateway and on to your on-premises network. When the traffic reaches your on-premises network, your router must know which hybrid node owns the IP address space of the destination pod, which is why your pod CIDR must be routable on your on-premises network for this communication path to work. -For additional information on running mixed mode clusters and the networking options available, see the link:https://docs.aws.amazon.com/eks/latest/best-practices/hybrid.html[EKS Best Practices Guide for hybrid deployments]. - *Access required during hybrid node installation and upgrade* You must have access to the following domains during the installation process where you install the hybrid nodes dependencies on your hosts. This process can be done once when you are building your operating system images or it can be done on each host at runtime. This includes initial installation and when you upgrade the Kubernetes version of your hybrid nodes.
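Once you have the list of required domains for your {aws} Region, a preflight check similar to the following can confirm outbound HTTPS access from a host before you start the installation. This is a minimal sketch: the endpoints shown are placeholders, and you should replace them with the domains listed in this section.

[source,bash,subs="verbatim,attributes"]
----
#!/usr/bin/env bash
# Placeholder endpoints: replace with the domains required for your cluster's AWS Region.
endpoints=(
  "https://placeholder-eks-endpoint.example.com"
  "https://placeholder-ecr-endpoint.example.com"
)

for url in "${endpoints[@]}"; do
  # --connect-timeout keeps the check fast; any HTTP response means the endpoint is reachable.
  if curl --silent --show-error --connect-timeout 5 --output /dev/null "$url"; then
    echo "OK      $url"
  else
    echo "FAILED  $url"
  fi
done
----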