The new query doesn't correctly filter by job_type (notice the filter clause is job.job_type = job.job_type):
SELECT job.id, job.dag_id, job.state, job.job_type, job.start_date, job.end_date, job.latest_heartbeat, job.executor_class, job.hostname, job.unixname
FROM job
WHERE job.job_type = job.job_type AND job.hostname = %(hostname_1)s ORDER BY job.latest_heartbeat DESC
LIMIT %(param_1)s
The old query, used in Helm chart v8.7.0 and earlier, filtered by job_type:
SELECT job.id, job.dag_id, job.state, job.job_type, job.start_date, job.end_date, job.latest_heartbeat, job.executor_class, job.hostname, job.unixname
FROM job
WHERE job.hostname = %(hostname_1)s AND job.job_type IN (__[POSTCOMPILE_job_type_1]) ORDER BY job.latest_heartbeat DESC
LIMIT %(param_1)s
While the new query does work, its inefficiency can cause database IO to be throttled, resulting in liveness probe failures due to query timeouts.
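To illustrate the difference between the two WHERE clauses, here is a minimal sketch using Python's built-in sqlite3 module. The table is a cut-down stand-in for the real job table, and the rows and the hostname 'scheduler-host' are made up for the demonstration. It is not the chart's actual probe code, and it only shows which rows each clause selects, not the IO cost on a production database.

```python
import sqlite3


def count_matches(conn, where_clause, params=()):
    """Count rows in the job table that satisfy the given WHERE clause."""
    query = f"SELECT COUNT(*) FROM job WHERE {where_clause}"
    return conn.execute(query, params).fetchone()[0]


conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the job table (only the columns used here).
conn.execute(
    "CREATE TABLE job (id INTEGER PRIMARY KEY, job_type TEXT, hostname TEXT)"
)
# Hypothetical rows: one scheduler job plus other job types on the same host.
rows = [
    ("SchedulerJob", "scheduler-host"),
    ("LocalTaskJob", "scheduler-host"),
    ("LocalTaskJob", "scheduler-host"),
    ("TriggererJob", "scheduler-host"),
    (None, "scheduler-host"),
]
conn.executemany("INSERT INTO job (job_type, hostname) VALUES (?, ?)", rows)

host = "scheduler-host"

# New-style clause: the self-comparison is true for every row whose job_type
# is not NULL, so only the hostname condition narrows the result.
new_style = count_matches(
    conn, "job.job_type = job.job_type AND job.hostname = ?", (host,)
)

# Old-style clause: only rows of the requested job type are matched.
old_style = count_matches(
    conn, "job.hostname = ? AND job.job_type IN (?)", (host, "SchedulerJob")
)

print(f"new-style clause matches {new_style} rows")
print(f"old-style clause matches {old_style} rows")
```

In this toy data set the self-comparison keeps every row with a non-NULL job_type for the host, while the IN filter keeps only the scheduler row.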
Relevant Logs
Warning Unhealthy 7m29s (x5 over 10m) kubelet Liveness probe failed: The SchedulerJob (id=97704647) for hostname 'airflow-scheduler-8586fc7d9f-br7vb' is not alive
Custom Helm Values
No response
Checks
User-Community Airflow Helm Chart
Chart Version
8.7.1
Kubernetes Version
Helm Version
Description
This change to the liveness probes in support of Airflow 2.6.0 is causing high database IO usage when using Airflow 2.5.3.
From what I can tell, the part of the query that is causing poor performance is this change:
I believe the issue is that the SchedulerJob class in 2.5.3 doesn't set a value for job_type.