Service level metrics #2

Open
elephantum opened this issue Aug 10, 2018 · 2 comments
Labels
help wanted (extra attention is needed)

Comments

@elephantum
Contributor

Implement technical metrics for the web UI, scheduler, and executors, such as uptime and the average time to reread the DagBag.
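As a rough illustration of the two metrics named above, here is a minimal sketch of a tracker for process uptime and the running average of a repeated operation such as a DagBag reread. The class name `ServiceMetrics` and its methods are hypothetical and are not part of Airflow or of this plugin.

```python
import time


class ServiceMetrics:
    """Hypothetical helper: tracks uptime and an average reload duration."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the class can be exercised deterministically.
        self._clock = clock
        self._started_at = clock()
        self._total_seconds = 0.0
        self._count = 0

    def uptime_seconds(self):
        """Seconds elapsed since this tracker was created."""
        return self._clock() - self._started_at

    def observe_reload(self, seconds):
        """Record the duration of one reload (e.g. one DagBag reread)."""
        self._total_seconds += seconds
        self._count += 1

    def average_reload_seconds(self):
        """Mean of all recorded durations, or 0.0 if none were recorded."""
        return self._total_seconds / self._count if self._count else 0.0
```

A component would create one tracker at startup and call `observe_reload` after each reread; how the reread duration is actually measured inside the scheduler is left open here.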

elephantum added the "help wanted" label on Aug 10, 2018
@smith-m

smith-m commented Aug 10, 2018

It would probably make sense to outline the minimum set of these metrics.

Also, which mechanisms are suggested for injecting an in-process HTTP interface into the scheduler and executor? Providing an HTTP component for metrics is probably in line with providing an HTTP interface for a /health endpoint (https://issues.apache.org/jira/browse/AIRFLOW-1084?jql=project%20%3D%20AIRFLOW%20AND%20text%20~%20health).
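One possible mechanism, sketched with only the Python standard library: run a small HTTP server on a daemon thread inside the process and serve both /health and /metrics from it. The names `make_handler`, `start_metrics_server`, and `render_metrics` are hypothetical; nothing here is an existing Airflow hook, and where a scheduler or executor would call it is an open question.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(render_metrics):
    """Build a request handler; render_metrics() returns the metrics text."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/health":
                status, body = 200, b"ok\n"
            elif self.path == "/metrics":
                status, body = 200, render_metrics().encode("utf-8")
            else:
                status, body = 404, b"not found\n"
            self.send_response(status)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            # Suppress the default per-request logging to stderr.
            pass

    return Handler


def start_metrics_server(render_metrics, host="127.0.0.1", port=0):
    """Start the server on a daemon thread; port=0 lets the OS pick a free port."""
    server = HTTPServer((host, port), make_handler(render_metrics))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The daemon thread means the HTTP server does not keep the host process alive on exit; `server.server_address[1]` gives the bound port and `server.shutdown()` stops it.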

Also note that there is one other Prometheus plugin for Airflow (https://github.com/airflow-plugins/pandora-plugin/blob/master/blueprints/metrics_blueprint.py), but to my knowledge it also lacks this functionality.

@f1yegor

f1yegor commented Sep 17, 2019

Some of the metrics mentioned above are available via Airflow's own statsd statistics. I'm using the statsd-prometheus exporter to monitor them; the downside is that the names do not follow Prometheus naming conventions (e.g. airflow_dag_<dag name>_duration).
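To make the naming problem concrete: Prometheus convention would keep one metric name and carry the DAG name as a label instead of embedding it in the name. Below is a minimal sketch of that rewrite. It assumes the statsd name arrives in dotted form as `airflow.dag.<dag name>.duration`, which is inferred from the flattened example above and not confirmed by the thread; the function `to_prometheus` is hypothetical.

```python
import re

# Assumed dotted layout: airflow.dag.<dag name>.<metric>
_DAG_METRIC = re.compile(r"^airflow\.dag\.(?P<dag>.+)\.(?P<metric>[A-Za-z_]+)$")


def to_prometheus(statsd_name):
    """Return (metric_name, labels) for a dotted statsd metric name."""
    match = _DAG_METRIC.match(statsd_name)
    if match is None:
        # Unknown pattern: only sanitize characters Prometheus does not allow.
        return re.sub(r"[^a-zA-Z0-9_:]", "_", statsd_name), {}
    # Move the DAG name out of the metric name and into a label.
    return "airflow_dag_" + match.group("metric"), {"dag": match.group("dag")}


print(to_prometheus("airflow.dag.my_dag.duration"))
```

With this shape, every DAG shares the single metric `airflow_dag_duration` and can be filtered or aggregated by the `dag` label.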


3 participants