feat: add default node anti-affinity for all operator pods
mmanciop authored and basti1302 committed Jan 24, 2025
1 parent 631e25b commit 7303e3d
Showing 4 changed files with 76 additions and 0 deletions.
25 changes: 25 additions & 0 deletions helm-chart/dash0-operator/README.md
@@ -480,6 +480,31 @@ By default, the operator collects metrics as follows:

Disabling or enabling individual metrics via configuration is currently not supported.

### Preventing Operator Scheduling on Specific Nodes

All pods deployed by the operator have a default node anti-affinity for the `dash0.com/enable=false` node label.
That is, if you add the `dash0.com/enable=false` label to a node, none of the pods owned by the operator will be
scheduled on that node.

**IMPORTANT:** This includes the daemonset that the operator sets up to receive telemetry from the pods, which might
lead to situations in which instrumented pods cannot send telemetry because the local node does not have a daemonset
pod.
In other words, if you want to monitor workloads with the Dash0 operator and use the `dash0.com/enable=false` node
anti-affinity, make sure that the workloads you want to monitor have the same anti-affinity:

```yaml
# Add this to your workloads
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: "dash0.com/enable"
                operator: "NotIn"
                values: ["false"]
```
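
The effect of the `NotIn` rule can be illustrated with a small, self-contained Go sketch. It is not the operator's or the Kubernetes scheduler's code; it only mimics the documented `NotIn` behavior, where a node matches if the label is absent or carries a value other than the listed ones. The helper names `matchesNotIn` and `nodesWithoutCollector`, as well as the node names, are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Label key and value taken from the README text; the constant names are hypothetical.
const (
	optOutLabelKey = "dash0.com/enable"
	optOutValue    = "false"
)

// matchesNotIn mimics a node selector requirement with operator NotIn:
// the node matches if the label is missing or its value is not in values.
func matchesNotIn(nodeLabels map[string]string, key string, values []string) bool {
	actual, present := nodeLabels[key]
	if !present {
		return true
	}
	for _, v := range values {
		if v == actual {
			return false
		}
	}
	return true
}

// nodesWithoutCollector returns the sorted names of nodes on which pods with
// the default anti-affinity (including the collector daemonset) cannot be scheduled.
func nodesWithoutCollector(nodes map[string]map[string]string) []string {
	var excluded []string
	for name, labels := range nodes {
		if !matchesNotIn(labels, optOutLabelKey, []string{optOutValue}) {
			excluded = append(excluded, name)
		}
	}
	sort.Strings(excluded)
	return excluded
}

func main() {
	// Hypothetical cluster: node name -> node labels.
	nodes := map[string]map[string]string{
		"node-a": {},
		"node-b": {optOutLabelKey: "false"},
		"node-c": {optOutLabelKey: "true"},
	}
	fmt.Println(nodesWithoutCollector(nodes)) // prints "[node-b]"
}
```

In this sketch only `node-b` is excluded. An instrumented workload that still lands on such a node would have no local daemonset pod to send telemetry to, which is why the workload should carry the same anti-affinity. Because the rule is `requiredDuringSchedulingIgnoredDuringExecution`, it is evaluated at scheduling time; pods already running on a node are not evicted when the label is added later.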

### Disabling Auto-Instrumentation for Specific Workloads

In namespaces that are Dash0-monitoring enabled, all supported workload types are automatically instrumented for
@@ -67,6 +67,14 @@ spec:
        {{- include "dash0-operator.podLabels" . | nindent 8 }}
        {{- end }}
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: "dash0.com/enable"
                    operator: "NotIn"
                    values: ["false"]
      containers:
        - name: manager
          image: {{ include "dash0-operator.image" . | quote }}
@@ -29,6 +29,15 @@ deployment should match snapshot (default values):
            app.kubernetes.io/name: dash0-operator
            dash0.com/cert-digest: dJTiBDRVJUSUZJQ
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: dash0.com/enable
                        operator: NotIn
                        values:
                          - "false"
          automountServiceAccountToken: true
          containers:
            - args:
34 changes: 34 additions & 0 deletions internal/backendconnection/otelcolresources/desired_state.go
@@ -451,6 +451,23 @@ func assembleCollectorDaemonSet(config *oTelColConfig, resourceSpecs *OTelColRes
		Labels: daemonSetMatchLabels,
	},
	Spec: corev1.PodSpec{
		Affinity: &corev1.Affinity{
			NodeAffinity: &corev1.NodeAffinity{
				RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
					NodeSelectorTerms: []corev1.NodeSelectorTerm{
						{
							MatchExpressions: []corev1.NodeSelectorRequirement{
								{
									Key:      dash0OptOutLabelKey,
									Operator: corev1.NodeSelectorOpNotIn,
									Values:   []string{"false"},
								},
							},
						},
					},
				},
			},
		},
		ServiceAccountName: daemonsetServiceAccountName(config.NamePrefix),
		SecurityContext: &corev1.PodSecurityContext{},
		// This setting is required to enable the configuration reloader process to send Unix signals to the
@@ -911,6 +928,23 @@ func assembleCollectorDeployment(
		Labels: deploymentMatchLabels,
	},
	Spec: corev1.PodSpec{
		Affinity: &corev1.Affinity{
			NodeAffinity: &corev1.NodeAffinity{
				RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
					NodeSelectorTerms: []corev1.NodeSelectorTerm{
						{
							MatchExpressions: []corev1.NodeSelectorRequirement{
								{
									Key:      dash0OptOutLabelKey,
									Operator: corev1.NodeSelectorOpNotIn,
									Values:   []string{"false"},
								},
							},
						},
					},
				},
			},
		},
		ServiceAccountName: deploymentServiceAccountName(config.NamePrefix),
		SecurityContext: &corev1.PodSecurityContext{},
		// This setting is required to enable the configuration reloader process to send Unix signals to the