Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(prometheus): Switch to upstream prometheus as the default Prometheus image #166

Conversation

jsirianni
Copy link
Member

@jsirianni jsirianni commented Sep 19, 2024

Description of Changes

Some users have strict security requirements that require auditing of unapproved images, such as ghcr.io/observiq/bindplane-prometheus. This is a source of friction when prom/prometheus is already approved for use.

This PR replaces bindplane-prometheus with prom/prometheus. The end result is identical, as the bindplane-prometheus image is just a wrapper around prom/prometheus, with our configurations baked in.

The following changes were required

  • Addition of configmap for injecting the three configuration files via volume mounts
  • Entrypoint command args

This change will be transparent to the user. The latest and past few BindPlane releases used Prometheus v2.54.1, which is now the default prometheus.image.tag value. The configuration and entrypoint command remain unchanged.

Testing

When the image is deployed, it has the following files in /etc/prometheus.

/prometheus $ ls -la /etc/prometheus
total 24
drwxr-xr-x    1 nobody   nobody        4096 Sep 19 16:35 .
drwxr-xr-x    1 root     root          4096 Sep 19 16:35 ..
lrwxrwxrwx    1 nobody   nobody          39 Aug 27 11:20 console_libraries -> /usr/share/prometheus/console_libraries
lrwxrwxrwx    1 nobody   nobody          31 Aug 27 11:20 consoles -> /usr/share/prometheus/consoles/
-rw-r--r--    1 root     nobody          59 Sep 19 16:35 prometheus.yml
-rw-r--r--    1 root     nobody         629 Sep 19 16:35 rules.yml
-rw-r--r--    1 root     nobody         172 Sep 19 16:35 web.yml

If you compare this to bindplane-prometheus, it is similar.

/prometheus $ ls -la /etc/prometheus
total 20
drwxr-xr-x    1 nobody   nobody        4096 Aug 28 13:21 .
drwxr-xr-x    1 root     root          4096 Sep 19 16:45 ..
lrwxrwxrwx    1 nobody   nobody          39 Aug  9 12:03 console_libraries -> /usr/share/prometheus/console_libraries
lrwxrwxrwx    1 nobody   nobody          31 Aug  9 12:03 consoles -> /usr/share/prometheus/consoles/
-rw-------    1 nobody   nobody          59 Aug 28 13:21 prometheus.yml
-rw-------    1 nobody   nobody         629 Aug 28 13:21 rules.yml
-rw-------    1 nobody   nobody         172 Aug 28 13:21 web.yml

The file permissions are different, but that is okay. In this case, 0644 owned by root is sufficient for the Prometheus process to read the config files. We had more control over this before when we built our own images. We have less flexibility here when mounting config files into containers using volume mounts.

Default / StatefulSet

I tested by deploying the "default" configuration.

# values.yaml
config:
  username: bpuser
  password: bppass
  sessions_secret: 4484766F-5016-4077-B8E0-0DE1D637854B
helm upgrade \
  --install bindplane \
  ./charts/bindplane \
  --values values.yaml
NAME                                         READY   STATUS    RESTARTS   AGE
bindplane-0                                  1/1     Running   0          17m
bindplane-prometheus-0                       1/1     Running   0          94s
bindplane-transform-agent-84d4bf89f7-b527n   1/1     Running   0          17m

After configuring a node agent, measurements begin flowing in.

Screenshot from 2024-09-19 12-43-40

High Availability w/ Upgrade and Downgrade

I tested HA with NATS + Postgres just to be certain.

k apply -f test/helper/postgres/postgres.yaml 
k create secret generic bindplane --from-literal=license=$BINDPLANE_LICENSE       

# Requires a /etc/hosts entry and `minikube tunnel`
minikube addons enable ingress 
config:
  username: bpuser
  password: bppass
  sessions_secret: 4484766F-5016-4077-B8E0-0DE1D637854B
  server_url: http://bindplane.local:80
  # The secret "bindplane" should exist and have the
  # key license with a license key as the value.
  # License is required in CI in order to use Postgres store.
  licenseUseSecret: true

ingress:
  enable: true
  host: bindplane.local
  class: nginx

backend:
  type: postgres
  postgres:
    host: postgres.postgres.svc.cluster.local
    database: bindplane
    username: postgres
    password: password
    maxConnections: 20
    
eventbus:
  type: nats

replicas: 1

resources:
  requests:
    memory: 100Mi
    cpu: 100m
  limits:
    memory: 100Mi

nats:
  resources:
    requests:
      memory: 100Mi
      cpu: 100m
    limits:
      memory: 100Mi
helm upgrade \
  --install bindplane-ha \
  ./charts/bindplane \
  --values values.yaml
NAME                                            READY   STATUS    RESTARTS   AGE
bindplane-ha-77b886bf96-nffnf                   1/1     Running   0          52s
bindplane-ha-jobs-6fdbf56784-snwxz              1/1     Running   0          7s
bindplane-ha-nats-0                             1/1     Running   0          2m17s
bindplane-ha-nats-1                             1/1     Running   0          2m17s
bindplane-ha-nats-2                             1/1     Running   0          2m17s
bindplane-ha-prometheus-0                       1/1     Running   0          2m17s
bindplane-ha-transform-agent-777cb688db-cfvdb   1/1     Running   0          2m17s

I can switch back and forth between Charts bindplane/bindplane (latest release) and ./charts/bindplane (this branch) without issue. Simulating upgrades and downgrades.

Topology, Agents page, and agent health all look good.

Please check that the PR fulfills these requirements

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • CI passes
  • Changes to ports, services, or other networking have been tested with istio

Comment on lines -40 to -42
{{- define "prometheus.tag" -}}
{{- printf "%s" (default (printf "%s" .Chart.AppVersion) .Values.prometheus.image.tag) }}
{{- end -}}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used this helper to set the Prometheus image based on the Chartfile's application version (BindPlane's version). This is no longer nessisary because the values file has a default value for prometheus.image.tag.

@@ -1,5 +1,35 @@
{{- if not .Values.prometheus.remote }}
{{- if not .Values.prometheus.enableSideCar }}
apiVersion: v1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These configurations match our specification, which can be found here or within the BindPlane repository under package/prometheus.

@@ -50,7 +80,16 @@ spec:
{{- end }}
containers:
- name: prometheus
image: {{ .Values.prometheus.image.name }}:{{ include "prometheus.tag" . }}
image: {{ .Values.prometheus.image.name }}:{{ .Values.prometheus.image.tag }}
args:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These arguments match the args defined here.

@jsirianni jsirianni marked this pull request as ready for review September 19, 2024 17:10
@jsirianni jsirianni requested a review from tbm48813 as a code owner September 19, 2024 17:10
@jsirianni jsirianni requested review from antonblock and removed request for tbm48813 September 19, 2024 17:10
@jsirianni jsirianni changed the title feat(prometheus): Switch to upstream prometheus as the default PRometheus image feat(prometheus): Switch to upstream prometheus as the default Prometheus image Sep 19, 2024
@jsirianni jsirianni merged commit fe798fb into main Sep 20, 2024
19 checks passed
@jsirianni jsirianni deleted the joesirianni/bpop-799-bindplane-helm-use-upstream-prometheus-image branch September 20, 2024 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants