Support for task-level metrics #85
Comments
Where would you like this data to end up, ultimately? Perfherder?
Perfherder would be a good initial repository for some data. But Perfherder is aimed at tracking specific per-repository metrics over repository time. It can't do things like track aggregate counts of events across all tasks. (Maybe it can in the database. But the UI is heavily tailored towards things like Talos results.) I feel like we're abusing Perfherder for things like tracking build times and compiler warnings. When all you have is a hammer...
FWIW, I have a half-concocted patch to add some really hacky output parsing to But with an in-task solution like
"Collect metrics from automation" is a wheel that we keep reinventing. The following are used in Firefox CI:
PERFHERDER_DATA special syntax log lines that get picked up by Treeherder's log ingestion system. The raw data gets exposed on Perfherder.
By implementing metrics collection within tasks, we frequently deal with the following problems:
PERFHERDER_DATA hack is the closest thing we have. We keep writing tools that watch things and emit PERFHERDER_DATA.
PERFHERDER_DATA blob at the end.
At the very least, I think TC should report resource utilization for tasks: wall time, CPU time, average CPU utilization, I/O counters, maximum memory utilization, etc. It doesn't have to be consistent across platforms. Report when you can easily and without a significant probe overhead and we can iterate from there.
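To make the resource utilization idea concrete, here is a minimal sketch of what a worker-side wrapper could collect around a task command. It is not how any Taskcluster worker actually does this: `run_with_metrics` and the metric names are hypothetical, and it assumes a Unix host, since Python's `resource` module is not available on Windows.

```python
import resource
import subprocess
import sys
import time


def run_with_metrics(argv):
    """Run argv as a child process and return (exit_code, metrics).

    Hypothetical helper for illustration only; Unix-only.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    proc = subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # User + system CPU time consumed by waited-for children during the run.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    metrics = {
        "wall_seconds": wall,
        "cpu_seconds": cpu,
        "avg_cpu_utilization": cpu / wall if wall > 0 else 0.0,
        # High-water mark across all waited-for children, not only this one.
        # Units differ by platform (KiB on Linux, bytes on macOS).
        "max_rss": after.ru_maxrss,
        # Block I/O operation counts; some platforms always report 0.
        "block_reads": after.ru_inblock - before.ru_inblock,
        "block_writes": after.ru_oublock - before.ru_oublock,
    }
    return proc.returncode, metrics


exit_code, metrics = run_with_metrics([sys.executable, "-c", "sum(range(100000))"])
for name, value in sorted(metrics.items()):
    print(name, value)
```

This only samples counters before and after the run, so the probe overhead is negligible. The trade-off is that it yields totals and a peak, not a time series.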
I think it would be really rad if TC could recognize metrics data from special syntax in task output. For example, if a task emitted lines with BEGIN_PHASE foo and END_PHASE foo, TC could record the times of various phases and then use that for correlating to resource utilization, displaying timelines of events, etc. This would allow all tasks to code to a universal "metrics language" and metrics would "just work."
Random technical thoughts:
abort:\s.*) and then having the worker managing that task parse for these and treat them specially is an interesting idea. It allows you to do things like automatically set anchors to "interesting" parts of logs and to create metrics from specific output patterns. The latter is useful when you don't have control over process output and need to invent a metrics signal from non-structured output.
CC @luser
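As a rough illustration of the two log-driven ideas above (phase markers and pattern-based anchors), a worker could feed each output line through something like the following sketch. The BEGIN_PHASE / END_PHASE syntax and the abort:\s.* pattern come from this issue. Everything else (`LogScanner`, its attributes, the injectable clock) is a hypothetical name invented for the example, not an existing Taskcluster API.

```python
import re
import time

# Marker syntax proposed in this issue: "BEGIN_PHASE <name>" / "END_PHASE <name>".
PHASE_RE = re.compile(r"^(BEGIN_PHASE|END_PHASE)\s+(\S+)\s*$")


class LogScanner:
    """Hypothetical line-by-line scanner a worker could run over task output."""

    def __init__(self, patterns=None, clock=time.monotonic):
        # patterns maps a label to a regex, e.g. {"abort": r"abort:\s.*"}.
        self.patterns = {name: re.compile(rx) for name, rx in (patterns or {}).items()}
        self.clock = clock
        self.open_phases = {}      # phase name -> start timestamp
        self.phase_durations = {}  # phase name -> accumulated seconds
        self.anchors = []          # (line number, pattern label)
        self.line_number = 0

    def feed(self, line):
        self.line_number += 1
        line = line.rstrip("\n")
        match = PHASE_RE.match(line)
        if match:
            kind, name = match.groups()
            now = self.clock()
            if kind == "BEGIN_PHASE":
                self.open_phases[name] = now
            elif name in self.open_phases:
                started = self.open_phases.pop(name)
                total = self.phase_durations.get(name, 0.0)
                self.phase_durations[name] = total + (now - started)
            return
        # Non-marker lines: record an anchor for every matching pattern.
        for label, rx in self.patterns.items():
            if rx.search(line):
                self.anchors.append((self.line_number, label))


# Usage with a fake clock so the timings are deterministic.
ticks = iter([10.0, 12.5])
scanner = LogScanner({"abort": r"abort:\s.*"}, clock=lambda: next(ticks))
for text in ["BEGIN_PHASE foo", "compiling...", "abort: no repository found", "END_PHASE foo"]:
    scanner.feed(text)
print(scanner.phase_durations, scanner.anchors)
```

Timestamps are taken when the worker sees the line, not when the task wrote it, so buffered output would skew phase durations. An END_PHASE without a matching BEGIN_PHASE is silently ignored here; a real implementation would probably want to surface that.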