feat: [PAYMCLOUD-207] Update node pool size to min10 and max11 (#2736)
Increase node pool size and update Kubernetes event filters.

Raised the minimum and maximum node counts of the user node pool in the WEU production environment (minimum 8 to 10, maximum 10 to 11) to handle increased workloads. Added new Kubernetes event reasons ("ContainerdStart" and "NodeNotReady") to the Opsgenie alert filters for improved monitoring of critical and warning events.

Signed-off-by: Fabio Felici <[email protected]>
ffppa authored Jan 21, 2025
1 parent 9d53baa commit 3f25ced
Showing 3 changed files with 12 additions and 3 deletions.
@@ -49,6 +49,8 @@ config:
- reason: "FreezeScheduled"
- reason: "TerminateScheduled"
- reason: "PreemptScheduled"
- reason: "ContainerdStart"
- reason: "NodeNotReady"
match:
- receiver: "${opsgenie_receiver_name}-critical"
type: "Warning"
@@ -68,6 +70,7 @@ config:
- reason: "FreezeScheduled"
- reason: "TerminateScheduled"
- reason: "PreemptScheduled"
- reason: "ContainerdStart"
match:
- receiver: ${opsgenie_receiver_name}-warning
reason: "OOMKilling"
@@ -81,4 +84,6 @@ config:
reason: "TerminateScheduled"
- receiver: ${opsgenie_receiver_name}-warning
reason: "PreemptScheduled"
- receiver: ${opsgenie_receiver_name}-warning
reason: "NodeNotReady"
%{ endif }
@@ -49,14 +49,15 @@ config:
- reason: "FreezeScheduled"
- reason: "TerminateScheduled"
- reason: "PreemptScheduled"
- reason: "ContainerdStart"
- reason: "NodeNotReady"
match:
- receiver: "${opsgenie_receiver_name}-critical"
type: "Warning"
- receiver: "${opsgenie_receiver_name}-critical"
reason: "Failed"
- receiver: "${opsgenie_receiver_name}-critical"
reason: "NotTriggerScaleUp"

- drop:
- reason: "Unhealthy"
- kind: "HorizontalPodAutoscaler"
@@ -69,6 +70,7 @@ config:
- reason: "FreezeScheduled"
- reason: "TerminateScheduled"
- reason: "PreemptScheduled"
- reason: "ContainerdStart"
match:
- receiver: ${opsgenie_receiver_name}-warning
reason: "OOMKilling"
@@ -82,4 +84,6 @@ config:
reason: "TerminateScheduled"
- receiver: ${opsgenie_receiver_name}-warning
reason: "PreemptScheduled"
- receiver: ${opsgenie_receiver_name}-warning
reason: "NodeNotReady"
%{ endif }
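The effect of the two new reasons on routing can be sketched in Python. This is a simplified model, not the exporter's actual implementation: it assumes the reason lists preceding each `match:` are drop lists (as the `- drop:` line suggests), that an event hitting any drop rule is discarded for that route, and that rule fields are compared by exact equality. The receiver names `critical` and `warning` stand in for the templated `${opsgenie_receiver_name}-critical` and `-warning` values, only the rules visible in the hunks are modelled, and the event types are assumed for illustration.

```python
from typing import Dict, List

Rule = Dict[str, str]


def rule_matches(rule: Rule, event: Dict[str, str]) -> bool:
    """A rule matches when every field it sets (other than 'receiver') equals the event's."""
    return all(event.get(k) == v for k, v in rule.items() if k != "receiver")


def route(event: Dict[str, str], drop: List[Rule], match: List[Rule]) -> List[str]:
    """Return the receivers for an event; an event hitting any drop rule goes nowhere."""
    if any(rule_matches(r, event) for r in drop):
        return []
    receivers: List[str] = []
    for r in match:
        if rule_matches(r, event) and r["receiver"] not in receivers:
            receivers.append(r["receiver"])
    return receivers


# Subset of the rules shown in the diff; the real lists are longer.
critical_drop = [{"reason": "ContainerdStart"}, {"reason": "NodeNotReady"}]
critical_match = [{"receiver": "critical", "type": "Warning"}]
warning_drop = [{"reason": "ContainerdStart"}]
warning_match = [{"receiver": "warning", "reason": "NodeNotReady"}]

# Hypothetical events; the "type" values are assumptions.
node_not_ready = {"type": "Warning", "reason": "NodeNotReady"}
containerd_start = {"type": "Warning", "reason": "ContainerdStart"}

print(route(node_not_ready, critical_drop, critical_match))    # []
print(route(node_not_ready, warning_drop, warning_match))      # ['warning']
print(route(containerd_start, critical_drop, critical_match))  # []
print(route(containerd_start, warning_drop, warning_match))    # []
```

Under these assumptions, "NodeNotReady" is kept out of the critical receiver and delivered only to the warning receiver, while "ContainerdStart" is dropped by both routes.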
4 changes: 2 additions & 2 deletions src/aks-platform/env/weu-prod/terraform.tfvars
@@ -50,8 +50,8 @@ aks_user_node_pool = {
vm_size = "Standard_D8ds_v5"
os_disk_type = "Ephemeral"
os_disk_size_gb = "300"
- node_count_min = "8"
- node_count_max = "10"
+ node_count_min = "10"
+ node_count_max = "11"
node_labels = { node_name : "aks-user-01", node_type : "user" },
node_taints = [],
node_tags = { node_tag_1 : "1" },
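As a quick check of what the new bounds mean for autoscaling, the arithmetic can be sketched as below. The helper `scaling_range` is hypothetical and not part of the repository; it only parses the string values from the tfvars file and reports how many nodes the pool can add above its minimum.

```python
# Values copied from the diff; the tfvars file stores them as strings.
old = {"node_count_min": "8", "node_count_max": "10"}
new = {"node_count_min": "10", "node_count_max": "11"}


def scaling_range(pool):
    """Return (min, max, headroom) as ints, rejecting an inverted range."""
    lo, hi = int(pool["node_count_min"]), int(pool["node_count_max"])
    if lo > hi:
        raise ValueError(f"node_count_min {lo} exceeds node_count_max {hi}")
    return lo, hi, hi - lo


print(scaling_range(old))  # (8, 10, 2)
print(scaling_range(new))  # (10, 11, 1)
```

The floor rises by two nodes and the ceiling by one, so the pool always runs at least as many nodes as the old maximum, with room for one additional node.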