Merge pull request #31 from projectsyn/feat/cluster-monitoring-label
Add label selectors for monitors and rules
glrf authored Jun 22, 2022
2 parents 388948c + fee5246 commit 1f341f6
Showing 38 changed files with 175 additions and 21 deletions.
28 changes: 23 additions & 5 deletions component/addons/cluster-monitoring.libsonnet
@@ -41,15 +41,33 @@ local kube = import 'lib/kube.libjsonnet';

prometheus+: {
spec+: {
local selector = {
local nsSelector = {
matchLabels: {
['monitoring.syn.tools/%s' % config.values.prometheus.name]: 'true',
},
},
serviceMonitorNamespaceSelector+: selector,
podMonitorNamespaceSelector+: selector,
probeNamespaceSelector+: selector,
ruleNamespaceSelector+: selector,
local optOutSelector = {
matchExpressions: [ {
key: 'monitoring.syn.tools/enabled',
operator: 'NotIn',
values: [ 'false', 'False' ],
} ],
},
local optInSelector = {
matchExpressions: [ {
key: 'monitoring.syn.tools/enabled',
operator: 'In',
values: [ 'true', 'True' ],
} ],
},
serviceMonitorNamespaceSelector+: nsSelector,
serviceMonitorSelector+: optOutSelector,
podMonitorNamespaceSelector+: nsSelector,
podMonitorSelector+: optOutSelector,
probeNamespaceSelector+: nsSelector,
probeSelector+: optOutSelector,
ruleNamespaceSelector+: nsSelector,
ruleSelector+: optInSelector,
},
},
},
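
A note on the selectors in this hunk. The sketch below models, in Python, how Kubernetes-style label selectors of this shape are generally evaluated. It is a simplified illustration, not the prometheus-operator's code, and `matches`, `OPT_OUT`, and `OPT_IN` are names invented for this example. It is meant to show why `NotIn` behaves as an opt-out (an object without the label still matches) while `In` behaves as an opt-in.

```python
ENABLED_LABEL = "monitoring.syn.tools/enabled"

# Mirrors optOutSelector and optInSelector from the hunk above.
OPT_OUT = {"matchExpressions": [
    {"key": ENABLED_LABEL, "operator": "NotIn", "values": ["false", "False"]},
]}
OPT_IN = {"matchExpressions": [
    {"key": ENABLED_LABEL, "operator": "In", "values": ["true", "True"]},
]}


def matches(selector: dict, labels: dict) -> bool:
    """Return True if `labels` satisfy every requirement in `selector`.

    Only matchLabels plus the In / NotIn operators are modelled here.
    """
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, operator, values = expr["key"], expr["operator"], expr["values"]
        if operator == "In":
            # The key must be present and its value must be listed.
            ok = key in labels and labels[key] in values
        elif operator == "NotIn":
            # A missing key also satisfies NotIn.
            ok = key not in labels or labels[key] not in values
        else:
            raise ValueError("operator not modelled: %s" % operator)
        if not ok:
            return False
    return True


if __name__ == "__main__":
    for labels in ({}, {ENABLED_LABEL: "true"}, {ENABLED_LABEL: "false"}):
        print(labels, "opt-out:", matches(OPT_OUT, labels),
              "opt-in:", matches(OPT_IN, labels))
```

Values are compared as exact strings in this model, so a label value such as `TRUE` would satisfy neither list; only the two spellings given in each selector count.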
19 changes: 15 additions & 4 deletions component/main.jsonnet
@@ -2,6 +2,9 @@
local com = import 'lib/commodore.libjsonnet';
local kap = import 'lib/kapitan.libjsonnet';
local kube = import 'lib/kube.libjsonnet';

local lib = import 'lib/prometheus.libsonnet';

local inv = kap.inventory();
// The hiera parameters for the component
local params = inv.parameters.prometheus;
@@ -102,11 +105,19 @@ local instances = std.mapWithKey(
instanceStacks
);

local enableAlert(name, object) =
if object.kind == 'PrometheusRule' then
lib.Enable(object)
else object;

(import 'operator.libsonnet')
+ namespaces
+ secrets
+ std.foldl(
function(prev, i) prev + instances[i],
std.objectFields(instances),
{}
+ std.mapWithKey(
enableAlert,
std.foldl(
function(prev, i) prev + instances[i],
std.objectFields(instances),
{}
)
)
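
To make the effect of the `std.mapWithKey(enableAlert, ...)` wrapper easier to follow, here is a rough Python model of it. It assumes the merged instances form a flat mapping from manifest name to object, which is how the hunk above reads. The manifest names and contents in `manifests` are hypothetical, and `map_with_key` only stands in for Jsonnet's `std.mapWithKey`.

```python
import copy

ENABLED_LABEL = "monitoring.syn.tools/enabled"


def enable(obj: dict) -> dict:
    """Model of lib.Enable: return a copy with the enabled label set to 'true'."""
    out = copy.deepcopy(obj)
    # Like `metadata+: { labels+: ... }`, existing labels are kept.
    out.setdefault("metadata", {}).setdefault("labels", {})[ENABLED_LABEL] = "true"
    return out


def enable_alert(name: str, obj: dict) -> dict:
    """Model of enableAlert: only PrometheusRule objects are labeled."""
    return enable(obj) if obj["kind"] == "PrometheusRule" else obj


def map_with_key(func, objects: dict) -> dict:
    """Stand-in for std.mapWithKey: apply func(key, value) to every field."""
    return {name: func(name, obj) for name, obj in objects.items()}


# Hypothetical rendered manifests, for illustration only.
manifests = {
    "rules": {
        "kind": "PrometheusRule",
        "metadata": {"name": "example-rules", "labels": {"role": "alert-rules"}},
    },
    "monitor": {
        "kind": "ServiceMonitor",
        "metadata": {"name": "example-monitor"},
    },
}

rendered = map_with_key(enable_alert, manifests)

if __name__ == "__main__":
    for name, obj in rendered.items():
        print(name, obj["metadata"].get("labels", {}))
```

In this model the rule keeps its existing `role` label and gains the enabled label, while the ServiceMonitor passes through unchanged. That matches the golden files further down, where only the alert-rule manifests gain `monitoring.syn.tools/enabled: 'true'`.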
16 changes: 13 additions & 3 deletions docs/modules/ROOT/pages/how-tos/cluster-monitoring.adoc
@@ -8,9 +8,14 @@ However it also supports picking up metrics and alerts from other workloads depl

To enable monitoring other components we simply need to add the `cluster-monitoring` addon.

This sets namespace slectors on every Prometheus instance that will result in them picking up all `ServiceMonitors`, `PodMonitors`, `Probes`, and `PrometheusRules` in namespaces with the label `monitoring.syn.tools/<instance>`.
This sets namespace selectors on every Prometheus instance that will result in them picking up all `ServiceMonitors`, `PodMonitors`, and `Probes` in namespaces with the label `monitoring.syn.tools/<instance>`.

The example blow will make the Prometheus instance `default-instance` pick up all namespaces with a label `monitoring.syn.tools/default-instance: "true"`.
`PrometheusRules` are only picked up if they're in a labeled namespace AND the rule is labeled with `monitoring.syn.tools/enabled: "true"`.
This should ensure that rules are added consciously and prevent a massive import of upstream alerts that aren't actionable or don't meet our standards.

You also have the option to disable `ServiceMonitors`, `PodMonitors`, or `Probes` by labeling them with `monitoring.syn.tools/enabled: "false"`.

The example below will make the Prometheus instance `default-instance` pick up all namespaces with a label `monitoring.syn.tools/default-instance: "true"`.

.Example
[source,yaml]
@@ -43,14 +48,17 @@ parameters:
When writing a component we can advertise metrics and rules to Prometheus by creating `ServiceMonitors` or `PrometheusRules` and correctly labeling the namespace of the component.
To do this, the component-prometheus provides helper functions as the library `lib/prometheus.libsonnet`.

The component namespace can easily be annotated by using the `RegisterNamespace` function.
The component namespace can easily be labeled by using the `RegisterNamespace` function.
This function takes a namespace and returns the provided namespace with additional necessary labels for Prometheus to pick it up.

The function `NetworkPolicy` returns a network policy that allows ingress traffic from the Prometheus namespace.
This means when writing a component you don't need to know where Prometheus is deployed.

We also provide helper functions to create `ServiceMonitors`, `PodMonitors`, `Probes`, and `PrometheusRules`.

The `PrometheusRule` helper function already ensures that the necessary `enabled` label is set.
If you need to enable an existing `PrometheusRule` you can use the `Enable()` helper function to set the label.


.Example
[source,jsonnet]
@@ -70,13 +78,15 @@ We also provide helper functions to create `ServiceMonitors`, `PodMonitors`, `Pr
'10_servicemonitor': prometheus.ServiceMonitor('foo'){ <3>
...
},
'10_alert': prometheus.Enable(upstreamAlert), <4>
}
----
<1> Add a label so the default instance will pick up the namespace
<2> Depending on the cluster distribution you will need to add a NetworkPolicy.
Without it Prometheus won't be able to scrape the targets.
The `NetworkPolicy` function will provide a correctly configured NetworkPolicy to allow ingress traffic from the Prometheus instance.
<3> Create a ServiceMonitor called 'foo' that's guaranteed to be picked up by Prometheus.
<4> Assuming there is an existing `upstreamAlert` rule you can enable it using the `Enable` helper function.

WARNING: Don't create a NetworkPolicy for permissive clusters without default NetworkPolicies.
Doing so will drop any traffic not originating from Prometheus.
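
The pickup rules described in this how-to combine a namespace label with an object label. The Python sketch below restates that decision as a single function, as a reading aid only. `is_picked_up` is a hypothetical helper, not part of `lib/prometheus.libsonnet`, and it assumes the `cluster-monitoring` addon is enabled.

```python
ENABLED_LABEL = "monitoring.syn.tools/enabled"
MONITOR_KINDS = {"ServiceMonitor", "PodMonitor", "Probe"}


def is_picked_up(kind: str, instance: str,
                 namespace_labels: dict, object_labels: dict) -> bool:
    """Decide whether the Prometheus instance `instance` selects the object."""
    # The namespace must carry monitoring.syn.tools/<instance>: "true".
    if namespace_labels.get("monitoring.syn.tools/%s" % instance) != "true":
        return False
    enabled = object_labels.get(ENABLED_LABEL)
    if kind == "PrometheusRule":
        # Rules are opt-in: the label must be present and true.
        return enabled in ("true", "True")
    if kind in MONITOR_KINDS:
        # Monitors and probes are opt-out: only an explicit false excludes them.
        return enabled not in ("false", "False")
    raise ValueError("kind not covered by this sketch: %s" % kind)


if __name__ == "__main__":
    ns = {"monitoring.syn.tools/default-instance": "true"}
    print(is_picked_up("ServiceMonitor", "default-instance", ns, {}))
    print(is_picked_up("PrometheusRule", "default-instance", ns, {}))
    print(is_picked_up("PrometheusRule", "default-instance", ns,
                       {ENABLED_LABEL: "true"}))
```

Under these assumptions an unlabeled ServiceMonitor in a registered namespace is scraped, while an unlabeled PrometheusRule in the same namespace is ignored until it is labeled, for example through `Enable`.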
37 changes: 36 additions & 1 deletion lib/prometheus.libsonnet
@@ -75,13 +75,45 @@ local api_version = {
monitoring: 'monitoring.coreos.com/v1',
};

/**
* \brief Helper to enable Monitor or Rule
*
* The `cluster-monitoring` addon needs to be enabled for this to have an effect.
*
* \arg A ServiceMonitor, PodMonitor, Probe, or PrometheusRule
* \return The given object with the necessary labels for Prometheus to pick it up
*/
local enable(object) = object {
metadata+: {
labels+: {
'monitoring.syn.tools/enabled': 'true',
},
},
};

/**
* \brief Helper to disable Monitor or Rule
*
* The `cluster-monitoring` addon needs to be enabled for this to have an effect.
*
* \arg A ServiceMonitor, PodMonitor, Probe, or PrometheusRule
* \return The given object with the necessary labels that makes Prometheus ignore it
*/
local disable(object) = object {
metadata+: {
labels+: {
'monitoring.syn.tools/enabled': 'false',
},
},
};

/**
* \brief Helper to create PrometheusRule objects.
*
* \arg The name of the PrometheusRule.
* \return A PrometheusRule object.
*/
local prometheusRule(name) = kube._Object(api_version.monitoring, 'PrometheusRule', name);
local prometheusRule(name) = enable(kube._Object(api_version.monitoring, 'PrometheusRule', name));

/**
* \brief Helper to create ServiceMonitor objects.
@@ -112,6 +144,9 @@ local probe(name) = kube._Object(api_version.monitoring, 'Probe', name);
RegisterNamespace: registerNamespace,
NetworkPolicy: networkPolicy,

Enable: enable,
Disable: disable,

PrometheusRule: prometheusRule,
ServiceMonitor: serviceMonitor,
PodMonitor: podMonitor,
@@ -33,19 +33,37 @@ spec:
podMonitorNamespaceSelector:
matchLabels:
monitoring.syn.tools/default-instance: 'true'
podMonitorSelector: {}
podMonitorSelector:
matchExpressions:
- key: monitoring.syn.tools/enabled
operator: NotIn
values:
- 'false'
- 'False'
probeNamespaceSelector:
matchLabels:
monitoring.syn.tools/default-instance: 'true'
probeSelector: {}
probeSelector:
matchExpressions:
- key: monitoring.syn.tools/enabled
operator: NotIn
values:
- 'false'
- 'False'
replicas: 2
resources:
requests:
memory: 400Mi
ruleNamespaceSelector:
matchLabels:
monitoring.syn.tools/default-instance: 'true'
ruleSelector: {}
ruleSelector:
matchExpressions:
- key: monitoring.syn.tools/enabled
operator: In
values:
- 'true'
- 'True'
securityContext:
fsGroup: 2000
runAsNonRoot: true
@@ -54,5 +72,11 @@ spec:
serviceMonitorNamespaceSelector:
matchLabels:
monitoring.syn.tools/default-instance: 'true'
serviceMonitorSelector: {}
serviceMonitorSelector:
matchExpressions:
- key: monitoring.syn.tools/enabled
operator: NotIn
values:
- 'false'
- 'False'
version: 2.29.1
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: prometheus
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 2.29.1
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: prometheus-default-instance-prometheus-rules
@@ -33,19 +33,37 @@ spec:
podMonitorNamespaceSelector:
matchLabels:
monitoring.syn.tools/other_instance: 'true'
podMonitorSelector: {}
podMonitorSelector:
matchExpressions:
- key: monitoring.syn.tools/enabled
operator: NotIn
values:
- 'false'
- 'False'
probeNamespaceSelector:
matchLabels:
monitoring.syn.tools/other_instance: 'true'
probeSelector: {}
probeSelector:
matchExpressions:
- key: monitoring.syn.tools/enabled
operator: NotIn
values:
- 'false'
- 'False'
replicas: 2
resources:
requests:
memory: 400Mi
ruleNamespaceSelector:
matchLabels:
monitoring.syn.tools/other_instance: 'true'
ruleSelector: {}
ruleSelector:
matchExpressions:
- key: monitoring.syn.tools/enabled
operator: In
values:
- 'true'
- 'True'
securityContext:
fsGroup: 2000
runAsNonRoot: true
@@ -54,5 +72,11 @@ spec:
serviceMonitorNamespaceSelector:
matchLabels:
monitoring.syn.tools/other_instance: 'true'
serviceMonitorSelector: {}
serviceMonitorSelector:
matchExpressions:
- key: monitoring.syn.tools/enabled
operator: NotIn
values:
- 'false'
- 'False'
version: 2.29.1
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: prometheus
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 2.29.1
monitoring.syn.tools/enabled: 'true'
prometheus: other_instance
role: alert-rules
name: prometheus-other_instance-prometheus-rules
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: prometheus
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 2.26.0
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: prometheus-default-instance-prometheus-rules
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: alertmanager
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 0.21.0
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: alertmanager-alertmanager-default-instance-rules
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: nodeexporter-default-instance
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 1.1.2
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: nodeexporter-default-instance-rules
@@ -7,6 +7,7 @@ metadata:
app.kubernetes.io/managed-by: commodore
app.kubernetes.io/name: kube-prometheus
app.kubernetes.io/part-of: kube-prometheus
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: kubernetes-monitoring-rules
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: kubestatemetrics-default-instance
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 2.0.0
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: kubestatemetrics-default-instance-rules
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: prometheus
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 2.29.1
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: prometheus-default-instance-prometheus-rules
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: alertmanager
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 0.22.2
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: alertmanager-alertmanager-default-instance-rules
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: nodeexporter-default-instance
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 1.2.2
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: nodeexporter-default-instance-rules
@@ -7,6 +7,7 @@ metadata:
app.kubernetes.io/managed-by: commodore
app.kubernetes.io/name: kube-prometheus
app.kubernetes.io/part-of: kube-prometheus
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: kubernetes-monitoring-rules
@@ -9,6 +9,7 @@ metadata:
app.kubernetes.io/name: kubestatemetrics-default-instance
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 2.1.1
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: kubestatemetrics-default-instance-rules
@@ -10,6 +10,7 @@ metadata:
app.kubernetes.io/name: prometheus
app.kubernetes.io/part-of: kube-prometheus
app.kubernetes.io/version: 2.32.1
monitoring.syn.tools/enabled: 'true'
prometheus: default-instance
role: alert-rules
name: prometheus-default-instance-prometheus-rules