
UnsupportedAddonModification When Adding Prometheus Configuration to amazon-cloudwatch-observability Addon #173

Open
viram99 opened this issue Feb 10, 2025 · 2 comments


viram99 commented Feb 10, 2025

Description:
I'm encountering an issue while attempting to configure the amazon-cloudwatch-observability addon for EKS using Terraform. The configuration applies successfully without the Prometheus section. However, when I add Prometheus support, the addon update fails with the following error:

waiting for EKS Add-On (dev1:amazon-cloudwatch-observability) update (0735ecff-d911-33f8-a1f9-8d6d3b18b841): unexpected state 'Failed', wanted target 'Successful'. last error: : UnsupportedAddonModification: Amazon EKS was unable to complete the addon operation. The configuration values provided are invalid.

Terraform configuration

resource "aws_eks_addon" "amazon-cloudwatch-observability" {
  cluster_name                = var.name
  addon_name                  = "amazon-cloudwatch-observability"
  addon_version               = "v3.0.0-eksbuild.1"
  resolve_conflicts_on_update = "OVERWRITE" # Resolve conflicts by overwriting existing configs

  configuration_values = jsonencode({
    admissionWebhooks = {
      certManager = {
        enabled = true
      }
    }
    agent = {
      config = {
        logs = {
          metrics_collected = {
            emf = {}
            kubernetes = {
              accelerated_compute_metrics = false
              enhanced_container_insights = true
            }
          }
          force_flush_interval = 5
        }
      }
      prometheus = {
        config = jsonencode({
          "global" = {
            "scrape_interval" = "10s"
            "scrape_timeout"  = "10s"
          }
        })
      }
    }
    containerLogs = {
      enabled = false
    }
  })

  depends_on = [helm_release.cert_manager]
}
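A possible cause (my assumption, not confirmed anywhere in this thread): the inner `jsonencode` around the Prometheus config produces a JSON *string* rather than a nested object, so the add-on's schema validation rejects the payload. A minimal Python sketch of the difference between double and single encoding:

```python
import json

# Double encoding, analogous to jsonencode inside jsonencode in Terraform:
inner = json.dumps({"global": {"scrape_interval": "10s"}})
outer_double = json.dumps({"prometheus": {"config": inner}})

# Single encoding: plain nested maps with one jsonencode at the top level.
outer_single = json.dumps(
    {"prometheus": {"config": {"global": {"scrape_interval": "10s"}}}}
)

# The double-encoded payload carries the config as an escaped JSON string,
# not as a JSON object:
print(type(json.loads(outer_double)["prometheus"]["config"]))   # <class 'str'>
print(type(json.loads(outer_single)["prometheus"]["config"]))   # <class 'dict'>
```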

Steps to Reproduce:

1. Apply the above Terraform configuration with the Prometheus block included.
2. Observe the failure message related to UnsupportedAddonModification.

Expected Behavior:
The addon should update successfully with the Prometheus configuration.

Actual Behavior:
The addon fails to update, citing invalid configuration values.

Additional Information:

Removing the prometheus block resolves the error, but Prometheus support is essential for my use case.
I'm following this configuration example from the official Helm chart repository.

@sky333999
Contributor

Hi @viram99, would you mind trying this out with the latest version of the EKS add-on, i.e. >= v3.3.0-eksbuild.1, and letting us know if it's still an issue?
We recently shipped a bug fix to read the Prometheus config correctly, so this is likely related.
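One way to confirm which add-on versions are available for your cluster before upgrading (this assumes the AWS CLI is installed and configured; the Kubernetes version shown is a placeholder, substitute your cluster's version):

```shell
# List available versions of the amazon-cloudwatch-observability add-on
# (replace 1.30 with your cluster's Kubernetes version)
aws eks describe-addon-versions \
  --addon-name amazon-cloudwatch-observability \
  --kubernetes-version 1.30 \
  --query 'addons[].addonVersions[].addonVersion' \
  --output text
```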


viram99 commented Feb 12, 2025

Hi @sky333999, thank you for your help.

We successfully deployed the Amazon CloudWatch Observability EKS Add-On using the following configuration.

resource "aws_eks_addon" "amazon-cloudwatch-observability-containerinsights" {
  cluster_name  = var.name
  addon_name    = "amazon-cloudwatch-observability"
  addon_version = "v3.3.0-eksbuild.1"
  configuration_values = jsonencode({
    admissionWebhooks = {
      certManager = {
        enabled = true
      }
    }
    agent = {
      config = {
        logs = {
          metrics_collected = {
            emf = {}
            kubernetes = {
              accelerated_compute_metrics = false
              enhanced_container_insights = true
            }
            prometheus = {
              prometheus_config_path = "/etc/prometheusconfig/prometheus.yaml"
              emf_processor = {
                # metric_namespace = "ContainerInsights/HamiltonSkills"
                metric_declaration = [
                  {
                    source_labels = ["service"]
                    label_matcher = ".*hamilton-skills.*"
                    dimensions    = [["service", "ClusterName", "Namespace"]]
                    metric_selectors = [
                      "^chat_tokens_per_minute$",
                      "^chat_requests_per_minute$",
                      "^chat_tokens_total$"
                    ]
                  }
                ]
              }
            }
          }
          force_flush_interval = 5
        }
      }
      prometheus = {
        config = {
          global = {
            scrape_interval = "60s"
            scrape_timeout  = "10s"
          }
          scrape_configs = [
              {
                job_name = "hamilton"
                static_configs = [{
                  targets = ["hamilton-skills.hamilton-skills.svc.cluster.local:8080"]
                }]
                kubernetes_sd_configs = [{
                  role = "pod"
                }]
                relabel_configs = [
                  {
                    source_labels = ["__meta_kubernetes_namespace"]
                    action        = "keep"
                    regex         = "hamilton-skills"
                  },
                  {
                    action        = "replace"
                    source_labels = ["__meta_kubernetes_namespace"]
                    target_label  = "Namespace"
                  },
                  {
                    source_labels = ["__meta_kubernetes_pod_name"]
                    action        = "replace"
                    target_label  = "pod_name"
                  },
                  {
                    action        = "replace"
                    source_labels = ["__meta_kubernetes_pod_container_name"]
                    target_label  = "container_name"
                  },
                  {
                    action        = "replace"
                    source_labels = ["__meta_kubernetes_pod_controller_name"]
                    target_label  = "pod_controller_name"
                  },
                  {
                    action        = "replace"
                    source_labels = ["__meta_kubernetes_pod_controller_kind"]
                    target_label  = "pod_controller_kind"
                  },
                  {
                    action        = "replace"
                    source_labels = ["__meta_kubernetes_pod_phase"]
                    target_label  = "pod_phase"
                  },
                  {
                    action        = "replace"
                    source_labels = ["__meta_kubernetes_pod_node_name"]
                    target_label  = "NodeName"
                  }
                ]
              }
            ]
        }
      }
    }
    containerLogs = {
      enabled = false
    }
  })
}

However, we've encountered a further issue with the amazon-cloudwatch-observability EKS Add-On: when both the prometheus and kubernetes blocks are defined under metrics_collected, custom metrics scraped by Prometheus are not ingested into CloudWatch.

Issue Details:
When both prometheus and kubernetes configurations are present:

  • Custom Prometheus metrics are not scraped or ingested into CloudWatch.
  • Removing the kubernetes block allows Prometheus metrics to be ingested correctly.

We suspect this is due to the mutual exclusivity of the prometheus and kubernetes options as hinted in the CloudWatch Agent source code.

Request:

  • Clarification: Is it by design that both prometheus and kubernetes configurations cannot coexist for metric collection?

Desired Outcome:

We need both Container Insights (via the kubernetes block) and custom Prometheus metrics to be collected and ingested into CloudWatch at the same time. This is crucial for comprehensive monitoring of both system-level and application-specific metrics.
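Until the coexistence question is clarified, a small pre-flight check in our pipeline warns when both blocks are present before `terraform apply`. This is a hypothetical helper we wrote for illustration, not part of any official tooling:

```python
import json

def check_metrics_collected(configuration_values: str) -> list:
    """Warn if potentially conflicting blocks coexist under metrics_collected.

    configuration_values is the JSON string passed to the add-on
    (i.e. the output of Terraform's jsonencode).
    """
    cfg = json.loads(configuration_values)
    collected = (
        cfg.get("agent", {})
           .get("config", {})
           .get("logs", {})
           .get("metrics_collected", {})
    )
    warnings = []
    if "kubernetes" in collected and "prometheus" in collected:
        warnings.append(
            "both 'kubernetes' and 'prometheus' are set under "
            "metrics_collected; Prometheus metrics may not be ingested"
        )
    return warnings

# Example: a config with both blocks triggers the warning.
sample = json.dumps({
    "agent": {"config": {"logs": {"metrics_collected": {
        "kubernetes": {"enhanced_container_insights": True},
        "prometheus": {"prometheus_config_path": "/etc/prometheusconfig/prometheus.yaml"},
    }}}}
})
print(check_metrics_collected(sample))
```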
