Target down alerts for error pods #2136
Unanswered
raghuafero
asked this question in
Q&A
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Hi,
We have installed prometheus in our GKE cluster using kube-prometheus-operator. What we. are observing is a periodic "TargetDown" Alerts (eg: 11.1% of ... targets in default namespace are down.). Upon checking the kubectl commands, we saw that prometheus was trying to scrape pods in "Error" state and these alerts are related to those pods. The Error state pods were due to scale up/down events for node pools. I looked at this issue in github (prometheus/prometheus#10755) and put the solution in the servicemonitor (kube-state-metrics), but that doesn't seem to be helping. Anybody here faced this issue and found a solution?
Please see the source_label "__meta_kubernetes_pod_phase" section below from prometheus.yaml.
Thanks
honor_labels: true
kubernetes_sd_configs:
namespaces:
names:
relabel_configs:
<<<<<<<>>>>>>>>>>
replacement: http
**- source_labels:
regex: Error
action: drop**
Beta Was this translation helpful? Give feedback.
All reactions