In the default installation, Prometheus is deployed with 2 replicas and Alertmanager with 3. Storage is up to the user, and we don't specify whether volumes are ReadWriteOnce or ReadWriteMany.
That depends on the storage provider and is out of scope for prometheus-operator. We define the volume claim options and the storage provider does the rest.
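As a rough illustration of "we define the volume claim option", persistent storage can be requested by overriding the Prometheus CR's `storage.volumeClaimTemplate` in the kube-prometheus jsonnet. This is a sketch only: the exact override layout varies between kube-prometheus versions, and `fast-storage` and `40Gi` are hypothetical values.

```jsonnet
// Sketch: request a PVC for Prometheus via the operator's CRD.
// 'fast-storage' is a placeholder StorageClass name; the access mode is
// whatever the provisioner supports (ReadWriteOnce here).
{
  prometheus+:: {
    prometheus+: {
      spec+: {
        storage: {
          volumeClaimTemplate: {
            spec: {
              storageClassName: 'fast-storage',
              accessModes: ['ReadWriteOnce'],
              resources: { requests: { storage: '40Gi' } },
            },
          },
        },
      },
    },
  },
}
```

The operator copies this template into the StatefulSet it generates; everything from there on (binding, topology) is the storage provisioner's job.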
You should use pod anti-affinity (provided by an addon in https://github.com/prometheus-operator/kube-prometheus/blob/main/jsonnet/kube-prometheus/addons/anti-affinity.libsonnet) to ensure replicas don't land on the same node, so that a single node failure cannot bring down the whole monitoring stack.
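Using the addon is just a matter of mixing it into the kube-prometheus build. A minimal sketch, assuming the standard jsonnet-bundler vendor layout (import paths may differ in your setup):

```jsonnet
// Sketch: add the anti-affinity addon on top of the base kube-prometheus
// library so Prometheus/Alertmanager replicas are spread across nodes.
local kp =
  (import 'kube-prometheus/main.libsonnet') +
  (import 'kube-prometheus/addons/anti-affinity.libsonnet') +
  {
    values+:: {
      common+: { namespace: 'monitoring' },  // example namespace
    },
  };

// Render the Prometheus manifests as usual.
{ [name]: kp.prometheus[name] for name in std.objectFields(kp.prometheus) }
```

Combined with 2+ replicas, this gives you availability through redundancy rather than through rescheduling a single pod.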
Prometheus and Alertmanager instances are managed by prometheus-operator, and StatefulSet creation is embedded in the logic of the operator itself. There is no easy way to change it.
This project doesn't provide a Helm chart.
Hi,
As I understand it, the Prometheus and Alertmanager instances that need persistent storage are currently deployed as single-instance StatefulSets that create ReadWriteOnce volume claims. The resulting pods are inherently bound to the cluster node to which they are first scheduled, which is not an HA setup: if that node goes down, Prometheus and/or Alertmanager are no longer available.
If a volume provisioner is available that provisions ReadWriteMany volumes, an alternative would be to deploy Prometheus and Alertmanager as single-instance Deployments, which could then be rescheduled to another node and so remain highly available. Would that be something this project would be interested in providing as an alternative "mode" of a Helm chart?