Set up autoscaling for Butler server
Enable pod auto-scaling for Butler server to allow for more concurrent requests.
dhirving committed Jan 13, 2025
1 parent fd400b8 commit 9824c6a
Showing 5 changed files with 29 additions and 10 deletions.
6 changes: 3 additions & 3 deletions applications/butler/README.md
@@ -11,10 +11,10 @@ Server for Butler data abstraction service
 | Key | Type | Default | Description |
 |-----|------|---------|-------------|
 | affinity | object | `{}` | Affinity rules for the butler deployment pod |
-| autoscaling.enabled | bool | `false` | Enable autoscaling of butler deployment |
-| autoscaling.maxReplicas | int | `100` | Maximum number of butler deployment pods |
+| autoscaling.enabled | bool | `true` | Enable autoscaling of butler deployment |
+| autoscaling.maxReplicas | int | `10` | Maximum number of butler deployment pods Each replica can have 40 database connections, so we need to make sure the combined connections are under the postgres connection limit. (Which is configurable, but currently set to 400 at the IDF.) |
 | autoscaling.minReplicas | int | `1` | Minimum number of butler deployment pods |
-| autoscaling.targetCPUUtilizationPercentage | int | `80` | Target CPU utilization of butler deployment pods |
+| autoscaling.targetCPUUtilizationPercentage | int | `25` | Target CPU utilization of butler deployment pods Butler CPU usage is very low in normal operation because most things are I/O bound. CPU usage can start creeping up if we have many queries running simultaneously (due to serialization overhead and spatial postprocessing.) In this case the thread pool and database connection pool are probably oversubscribed long before we hit 100% cpu usage, so we want to get more replicas up at fairly low CPU usage. |
 | config.additionalS3EndpointUrls | object | No additional URLs | Endpoint URLs for additional S3 services used by the Butler, as a mapping from profile name to URL. |
 | config.dp02ClientServerIsDefault | bool | `false` | True if the 'dp02' Butler repository alias should use client/server Butler. False if it should use DirectButler. |
 | config.dp02PostgresUri | string | No configuration file for DP02 will be generated. | Postgres connection string pointing to the registry database hosting Data Preview 0.2 data. |
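
The connection-limit reasoning in the `autoscaling.maxReplicas` description can be checked with quick arithmetic. The following is a minimal sketch using the two figures quoted above (40 connections per replica, a 400-connection limit at the IDF). The function name and the `reserved` parameter are hypothetical and are not part of the chart.

```python
def max_safe_replicas(connection_limit: int, per_replica: int, reserved: int = 0) -> int:
    """Largest replica count whose combined connection pools fit within the limit.

    `reserved` is a hypothetical allowance for connections held by other clients.
    """
    if per_replica <= 0:
        raise ValueError("per_replica must be positive")
    return max((connection_limit - reserved) // per_replica, 0)


# Figures from the README description: 40 connections per replica, limit of 400.
print(max_safe_replicas(400, 40))
```

With these inputs, 10 replicas with full pools would use all 400 connections, so the default sits exactly at the limit. Any connections needed by other clients would have to be accounted for separately, for example through a non-zero `reserved` value.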
10 changes: 7 additions & 3 deletions applications/butler/templates/hpa.yaml
@@ -1,5 +1,5 @@
 {{- if .Values.autoscaling.enabled }}
-apiVersion: autoscaling/v2beta1
+apiVersion: autoscaling/v2
 kind: HorizontalPodAutoscaler
 metadata:
   name: "butler"
@@ -17,12 +17,16 @@ spec:
     - type: Resource
       resource:
         name: "cpu"
-        targetAverageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
+        target:
+          type: Utilization
+          averageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
     {{- end }}
     {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
     - type: Resource
       resource:
         name: "memory"
-        targetAverageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
+        target:
+          type: Utilization
+          averageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
     {{- end }}
 {{- end }}
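
For context on what `averageUtilization` drives: the Kubernetes documentation gives the autoscaler's core calculation as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. The sketch below applies that formula with the chart's defaults from this commit (target 25, `maxReplicas` 10) and the IDF `minReplicas` of 3. It deliberately ignores the controller's tolerance band, stabilization windows, and pod-readiness handling, and the function name is hypothetical.

```python
import math


def desired_replicas(current: int, avg_utilization: float, target: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Simplified HPA replica calculation, clamped to the configured bounds."""
    raw = math.ceil(current * avg_utilization / target)
    return max(min_replicas, min(raw, max_replicas))


# Three replicas averaging 50% CPU utilization against a 25% target.
print(desired_replicas(3, 50, 25, min_replicas=3, max_replicas=10))
```

Under this simplified model, a low target such as 25 means a modest rise in average CPU utilization already roughly doubles the replica count, which matches the stated intent to bring up replicas at fairly low CPU usage.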
2 changes: 2 additions & 0 deletions applications/butler/values-idfint.yaml
@@ -1,3 +1,5 @@
+autoscaling:
+  minReplicas: 3
 config:
   dp02ClientServerIsDefault: true
   dp02PostgresUri: postgresql://[email protected]:5432/dp02
2 changes: 2 additions & 0 deletions applications/butler/values-idfprod.yaml
@@ -1,3 +1,5 @@
+autoscaling:
+  minReplicas: 3
 config:
   dp02ClientServerIsDefault: true
   dp02PostgresUri: postgresql://[email protected]/idfdp02
19 changes: 15 additions & 4 deletions applications/butler/values.yaml
@@ -21,16 +21,27 @@ ingress:

 autoscaling:
   # -- Enable autoscaling of butler deployment
-  enabled: false
+  enabled: true

   # -- Minimum number of butler deployment pods
   minReplicas: 1

   # -- Maximum number of butler deployment pods
-  maxReplicas: 100
+  #
+  # Each replica can have 40 database connections, so we need to make sure the
+  # combined connections are under the postgres connection limit. (Which is
+  # configurable, but currently set to 400 at the IDF.)
+  maxReplicas: 10

   # -- Target CPU utilization of butler deployment pods
-  targetCPUUtilizationPercentage: 80
+  #
+  # Butler CPU usage is very low in normal operation because most things are
+  # I/O bound. CPU usage can start creeping up if we have many queries running
+  # simultaneously (due to serialization overhead and spatial postprocessing.)
+  # In this case the thread pool and database connection pool are probably
+  # oversubscribed long before we hit 100% cpu usage, so we want to get more
+  # replicas up at fairly low CPU usage.
+  targetCPUUtilizationPercentage: 25
   # targetMemoryUtilizationPercentage: 80

 # -- Annotations for the butler deployment pod
@@ -45,7 +56,7 @@ resources:
     # 40 threads in the thread pool running large queries costing ~35MB each.
     memory: "1.5Gi"
   requests:
-    cpu: "15m"
+    cpu: "1"
     # Butler server uses around 200MB idle at startup, but under dynamic usage
     # Python seems to want to hold onto another couple hundred megabytes of
     # heap.
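
The change to `requests.cpu` interacts with the autoscaling target, because a `Utilization` target in a HorizontalPodAutoscaler is measured as a percentage of each pod's CPU request. The commit does not state this as its motivation, so the following is only one reading of the change: a small sketch, with a hypothetical helper name, of the per-pod CPU usage at which the 25% target is reached under the old and new requests.

```python
def scale_up_threshold_millicores(request_millicores: float, target_percent: float) -> float:
    """CPU usage in millicores at which utilization equals the target percentage."""
    return request_millicores * target_percent / 100.0


# Old request of 15m versus the new request of 1 CPU (1000m), both at the 25% target.
old_threshold = scale_up_threshold_millicores(15, 25)
new_threshold = scale_up_threshold_millicores(1000, 25)
print(old_threshold, new_threshold)
```

With a 15m request, the 25% target would be crossed at under 4 millicores of usage per pod. With a request of one full CPU, it is crossed at a quarter of a core.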
