
(module name): (short issue description) #1053

Open
NicoleY666 opened this issue Aug 3, 2024 · 2 comments
Labels
bug Something isn't working

Comments


NicoleY666 commented Aug 3, 2024

Describe the bug

Here is the addon deployment:

const region = "ap-southeast-2";
const karpenterAddOn = new blueprints.addons.KarpenterAddOn({
  version: '0.35.5',
  nodePoolSpec: {
    requirements: [
      { key: 'node.kubernetes.io/instance-type', operator: 'In', values: ["c5d.4xlarge", "c6a.2xlarge", "c6a.4xlarge", "c6a.8xlarge", "c6a.16xlarge"] },
      { key: 'topology.kubernetes.io/zone', operator: 'In', values: [`${region}a`, `${region}b`, `${region}c`] },
      { key: 'kubernetes.io/arch', operator: 'In', values: ['amd64', 'arm64'] },
      { key: 'karpenter.sh/capacity-type', operator: 'In', values: ['on-demand'] },
    ],
    disruption: {
      consolidationPolicy: "WhenEmpty",
      consolidateAfter: "30s",
      expireAfter: "72h",
      budgets: [{ nodes: "10%" }]
    }
  },
  ec2NodeClassSpec: {
    amiFamily: "AL2",
    subnetSelectorTerms: [{ tags: { "ops:repo": "xxxx" } }],
    securityGroupSelectorTerms: [{ tags: { "aws:eks:cluster-name": 'xxxxx' } }],
  },
  interruptionHandling: true,
  podIdentity: false,
});

const addOns: Array<blueprints.ClusterAddOn> = [
  new blueprints.addons.CalicoOperatorAddOn(),
  new blueprints.addons.MetricsServerAddOn(),
  new blueprints.addons.AwsLoadBalancerControllerAddOn({
    enableWaf: false,
    version: mapping[env].helmChartVersion,
  }),
  new blueprints.addons.VpcCniAddOn(),
  new blueprints.addons.CoreDnsAddOn(),
  new blueprints.addons.KubeProxyAddOn(),
  new blueprints.addons.SSMAgentAddOn(),
  new blueprints.addons.CloudWatchInsights(),
  karpenterAddOn
];

Attached: the new node without any deployments.

[Screenshot 2024-08-03 at 9:53:23 PM: node details]

Expected Behavior

I expect Karpenter to scale nodes up and down automatically, like a smart autoscaler.

Current Behavior

There are three nodes at the same time:
NAME↑ STATUS ROLE TAINTS VERSION PODS CPU MEM %CPU %MEM CPU/A MEM/A AGE
ip-10-60-63-193.ap-southeast-2.compute.internal Ready 0 v1.29.3-eks-ae9a62a 14 113 1866 0 6 15890 28360 32h
ip-10-60-74-255.ap-southeast-2.compute.internal Ready 0 v1.29.3-eks-ae9a62a 16 130 1755 0 6 15890 28360 31h
ip-10-60-95-42.ap-southeast-2.compute.internal Ready 3 v1.29.6-eks-1552ad0 9 0 0 0 0 7910 14640 8h

The third node is not scaled down as expected, even though there are no new deployments. When I checked the pods on the third node, I found calico-system calico-typha-988d6c9c5-fh55r (which is not a DaemonSet pod) blocking Karpenter from scaling the node down. This pod is deployed by CalicoOperatorAddOn(), which creates three pods:
calico-system calico-node-jnx6c (DaemonSet)
calico-system calico-typha-988d6c9c5-fh55r (Deployment)
calico-system csi-node-driver-ld48d (DaemonSet)

Since calico-typha is created by the addon, I don't know how to make Karpenter scale the node down as expected.
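A possible workaround (a sketch, not a verified fix for this cluster): on the Karpenter v1beta1 API used by chart 0.35.x, changing consolidationPolicy from "WhenEmpty" to "WhenUnderutilized" allows Karpenter to consolidate nodes that still run reschedulable non-DaemonSet pods such as calico-typha. Note that v1beta1 does not accept consolidateAfter together with "WhenUnderutilized", so that field has to be dropped:

```typescript
// Sketch of an alternative disruption block for nodePoolSpec (assumption:
// the blueprint passes this through to a Karpenter v1beta1 NodePool).
// "WhenUnderutilized" consolidates nodes whose pods can be rescheduled
// elsewhere, even when non-DaemonSet pods (e.g. calico-typha) are present.
const disruption = {
  consolidationPolicy: "WhenUnderutilized", // was "WhenEmpty"
  // consolidateAfter is intentionally omitted: the v1beta1 NodePool schema
  // rejects it when the policy is WhenUnderutilized.
  expireAfter: "72h",
  budgets: [{ nodes: "10%" }],
};
```

The trade-off is that Karpenter may then evict calico-typha during consolidation; a PodDisruptionBudget on the typha Deployment (the Tigera operator typically manages one) should keep at least one replica available while the pod is moved.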

Reproduction Steps

The deployed code is shown above.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.147.3

EKS Blueprints Version

1.15.1

Node.js Version

20

Environment details (OS name and version, etc.)

EKS

Other information

No response

NicoleY666 added the bug label on Aug 3, 2024
shapirov103 (Collaborator) commented:

@NicoleY666

The CalicoOperator addon deploys only the operator; that component then deploys the pods you mentioned. This behavior is controlled by Calico itself.

Let me understand the issue: you have a node with calico pods running. Your screenshots show the pods running on ip-10-60-95-42. Is that the node that you want to scale down or are there any nodes with no pods which are not scaled down?

Calico CNI is not a component that we support functionally. While we support provisioning of that component (operator), the actual software is maintained by the Calico community (or Tigera for enterprise support). In general, CNI components are considered to be mission critical and may have specific disruption rules applied.

NicoleY666 (Author) commented:

Yes, you are correct, but that pod is blocking the node from scaling down, since the node is not considered empty (DaemonSet pods excluded). Is there a way to exclude certain Deployments so the node can still scale down? The policy is WhenEmpty, so a node can only be scaled down when it runs nothing but DaemonSet pods. However, calico-typha is created as a Deployment controlled by the CalicoOperator addon, and I can't turn it into a DaemonSet. With the calico-typha pod on the node, Karpenter never marks the node as empty. As the attached screenshots show, there shouldn't be four nodes each under 20% usage.

[screenshots: node utilization]

There is also a node running only DaemonSet pods that is not scaled down as expected; I'm not quite sure why. [screenshot]

Here is the DaemonSet screenshot: [screenshot]
