permalink |
---|
/guides/app-monitoring/ |
For application monitoring on OpenShift, you need to set up your own Prometheus and Grafana deployments. There are two approaches for setting up Prometheus on OpenShift.
-
The first approach is via Prometheus Operator and Service Monitor, which is the newest and the most popular way of setting up Prometheus on a Kubernetes system.
-
Use the legacy way of deploying Prometheus on OpenShift without the Prometheus Operator.
This guide explores both approaches to set up Prometheus on Openshift.
Prior to deploying Prometheus, ensure that there is a running application that has a service endpoint for outputting metrics in Prometheus format.
It is assumed such a running application has been deployed to the OpenShift cluster inside a project/namespace called myapp
, and that the Prometheus metrics endpoint is exposed on path /metrics
.
The Prometheus Operator is an open-source project originating from CoreOS and exists as as part of their Kubernetes Operator offering. The Kubernetes Operator framework is becoming the standard for Prometheus deployments on a Kubernetes system. When the Prometheus Operator is installed on the Kubernetes system, you no longer need to hand-configure the Prometheus configuration. Instead, you create ServiceMonitor resources for each of the service endpoints that needs to be monitored: this makes daily maintainenance of the Prometheus server a lot easier. An architecture overview of the Prometheus Operator is shown below:
There are two ways to install the Prometheus Operator.
-
One is through Openshift Operator Lifecycle Manager or OLM, (which is still in its technology preview phase in release 3.11 of Openshift). This approach installs an older version of the Prometheus Operator that is supported by Red Hat.
-
Another approach is to install Prometheus Operator by following the guide from the Prometheus Operator git repository. Since OLM is still at its technical preview stage and requires a Red Hat subscription account for installation, this guide showsthe installation without OLM to target a larger audience base. The guide will be updated with the OLM approach when Kabanero officially adopts Openshift 4.x.
The following procedure is based on the Prometheus Getting Started guide maintained by the CoreOS team, with the added inclusion of Openshift commands needed to complete each step.
-
Clone the Prometheus Operator repository
[root@rhel7-openshift ~]# git clone https://github.com/coreos/prometheus-operator
-
Open the
bundle.yaml
file and change all instances of namespace: default to the namespace where you want to deploy the Prometheus Operator. For example, if the namespace is namespace: prometheus-operator, you might need to create a new namespace called prometheus-operator if it does not exit on your Openshift cluster. -
Save the
bundle.yaml
file and deploy the Prometheus Operator using the following command.
[root@rhel7-openshift ~]# oc apply -f bundle.yaml
You can receive an error message like the one below when running the command.
Error creating: pods "prometheus-operator-5b8bfd696-" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 65534: must be in the ranges: [1000070000, 1000079999]]
To correct the error, change the runAsUser: 65534 field in the bundle.yaml
file to a valid value that is in the range specified in the error message. In this case, setting runAsUser: 1000070000 in the bundle.yaml
would be in the valid range. Save the bundle.yaml
file and re-deploy the Prometheus Operator.
[root@rhel7-openshift ~]# oc delete -f bundle.yaml [root@rhel7-openshift ~]# oc apply -f bundle.yaml
service_monitor.yaml
link:code/service_monitor.yaml[role=include]
prometheus.yaml
link:code/prometheus.yaml[role=include]
prometheus_snippet.yaml
link:code/prometheus_snippet.yaml[role=include]
The service_monitor.yaml
file defines a ServiceMonitor resource. A ServiceMonitor defines a service endpoint that needs to be monitored by the Prometheus instance. Take for example, an application with label app: myapp from namespace myapp, and metrics endpoints defined in spec.endpoints to be monitored by the Promtheus Operator. If the metrics endpoint is secured, you can define a secured endpoint with authentication configuration by following the endpoint API documentation of Prometheus Operator.
Create the service_monitor.yaml
file
-
Apply the
servicemonitor.yaml
file to create the ServiceMonitor resource.
[root@rhel7-openshift ~]# oc apply -f service_monitor.yaml
-
Define a Prometheus resource that can scrape the targets defined in the ServiceMonitor resource. Create a
prometheus.yaml
file that aggregates all the files from the git repository directoryprometheus-operator/example/rbac/prometheus/
. NOTE: Make sure to change thenamespace: default
tonamespace: prometheus-operator
.
Create the prometheus.yaml
file
-
Apply the
prometheus.yaml
file to deploy the Prometheus service. After all the resources are created, apply the Prometheus Operatorbundle.yaml
file again.
[root@rhel7-openshift ~]# oc apply -f prometheus.yaml [root@rhel7-openshift ~]# oc apply -f bundle.yaml
-
Verify that the Prometheus services have successfully started. The prometheus-operated service is created automatically by the prometheus-operator, and is used for registering all deployed prometheus instances.
[root@rhel-2EFK ~]# oc get svc -n prometheus-operator
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
prometheus NodePort 172.30.112.199 <none> 9090:30342/TCP 19h
prometheus-operated ClusterIP None <none> 9090/TCP 19h
prometheus-operator ClusterIP None <none> 8080/TCP 21h
-
Expose the prometheus-operated service to use the Prometheus console externally.
[root@rhel-2EFK]# oc expose svc/prometheus-operated -n prometheus-operator
[root@rhel-2EFK]# oc get route -n prometheus-operator
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
prometheus prometheus-prometheus-operator.apps.9.37.135.153.nip.io prometheus web None
-
Visit the prometheus route and go to the Prometheus targets page. At this point, the page should be empty with no endpoints being discovered.
-
Look at the
prometheus.yaml
file and update the serviceMonitorNamespaceSelector and serviceMonitorSelector fields. The ServiceMonitor needs to satisfy the matching requirement for both selectors before it can be picked up by the Prometheus service, like in theprometheus_snippet.yaml
file. In this case,our ServiceMonitor has the k8s-app label, but the target namespace "myapp" is missing the required prometheus: monitoring label.
Updateprometheus.yaml
to reflect theprometheus_snippet.yaml
file
-
Add the label to the "myapp" namespace.
[root@rhel-2EFK]# oc label namespace myapp prometheus=monitoring
-
Check to see that the Prometheus targets page is picking up the target endpoints. If the service endpoint is discovered, but Prometheus is reporting a DOWN status, you need to make the prometheus-operator project to be globally accessible.
oc adm pod-network make-projects-global prometheus-operator
For users who just migrated their applications to Openshift and define their own Prometheus configuration file, using the Prometheus Operator is not the only option for Prometheus deployments. You can deploy Prometheus by using the example yaml file provided by the Openshift origin repository.
[root@rhel7-openshift ~]# oc new-project Prometheus
-
Deploy the Prometheus using the sample
prometheus.yaml
file from here
[root@rhel7-openshift ~]# oc new-app -f https://raw.githubusercontent.com/openshift/origin/master/examples/prometheus/prometheus.yaml -p NAMESPACE=prometheus
-
Edit the "prometheus" ConfigMap resource from the prometheus namespace.
[root@rhel7-openshift ~]# oc edit configmap/prometheus -n prometheus
scrape_configs.yaml
link:code/scrape_configs.yaml[role=include]
scrape_configs_snippet.yaml
link:code/scrape_configs_snippet.yaml[role=include]
-
Remove all exiting jobs and add the following
scrap_configs
job. -
Kill the existing Prometheus pod, or better yet, reload the Prometheus service gracefully using the command below for the new configuration to take effect.
[root@rhel7-openshift ~]# oc exec prometheus-0 -c prometheus -- curl -X POST http://localhost:9090/-/reload
Make sure the monitored application's pods are started with the following annotations as specified in the prometheus ConfigMap's scrape_configs.
-
Verify the scrape target is up and available in Prometheus by using Prometheus’s web console as follows: Click Console → Status → Targets.
If the service endpoint is discovered, but Prometheus is reporting a DOWN status, you need to make the prometheus-operator project globally accessible.
oc adm pod-network make-projects-global prometheus
Regardless of which approach was used to deploy Prometheus on Openshift, use Grafana for dashboard and visualization of the metrics. Use the sample grafana.yaml
file provided by Openshift in the Openshift origin git repository to install Grafana. NOTE: Perform the following steps to ensure that Prometheus endpoints are reachable as a data source in Grafana.
-
Create a new project called grafana.
[root@rhel-2EFK ~]# oc new-project grafana
-
Deploy Grafana using the
grafana.yaml
file from the Openshift origin repository.
[root@rhel-2EFK ~]# oc new-app -f https://raw.githubusercontent.com/openshift/origin/master/examples/grafana/grafana.yaml -p NAMESPACE=grafana
-
Grant the grafana service account view access to the prometheus (or prometheus operator) namespace
[root@rhel-2EFK ~]# oc policy add-role-to-user view system:serviceaccount:grafana:grafana -n prometheus
grafana-datasources.yaml
link:code/grafana-datasources.yaml[role=include]
-
For Grafana to add existing prometheus datasources in Openshift, define the datasources in a ConfigMap resource under the grafana namespace. Create a ConfigMap yaml file called
grafana-datasources.yaml
. -
Apply the
grafana-datasources.yaml
file to create the ConfigMap resource.
[root@rhel-2EFK ~]# oc apply -f grafana-datasources.yaml
-
Acquire the [grafana-ocp token] by using the following command.
[root@rhel-2EFK ~]# oc sa get-token grafana
-
Add the ConfigMap resource to the Grafana application and mount it to
/usr/share/grafana/datasources
.
-
Save and test the data source. You should see 'Datasource is working'.
You can now consume all the application metrics gathered by Prometheus on the Grafana dashboard.