Long term performance monitoring using CI #8450

abadams · 2024-10-29T17:58:32Z

#8447 adds some reporting of thread pool performance stats to CI, but you have to go in and copy-paste it from stdout on each bot if you want to use it for something.

It would be cool if we had two things:

1. A web interface where you could query and graph performance information over time. E.g. local laplacian seems to be slow on Mac. When did that start happening? What commit was responsible? Has LLVM 20 been an improvement over LLVM 19 for us?
2. A performance report (e.g. posted as a GitHub comment) on each PR that summarizes the performance information. It would be triggered somehow once all builds are green. Ideally it would compare the PR's numbers to the most recent performance information from a build of main on the same bot.

I'm imagining augmenting all code that reports a runtime worth tracking with a helper that detects if it's running on a buildbot (e.g. by checking for the environment variable HL_SEND_RUNTIMES_TO_BUILDBOT_MASTER), and if so uploads the hostname, commit, branch, LLVM version, time and date, some identifier for the datum, and a runtime (or list of identifiers and runtimes?). It could go to the buildbot master, or maybe it can go directly to some REST API for some logging service.
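A minimal sketch of what such a helper could look like, assuming a header-only design. Only the environment variable name comes from the proposal above; the struct, the function names, the tab-separated layout, and the millisecond unit are all hypothetical choices made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical record layout; the fields mirror the list in the proposal.
struct RuntimeRecord {
    std::string hostname;
    std::string commit;
    std::string branch;
    std::string llvm_version;
    std::string timestamp;   // time and date, formatted by the caller
    std::string identifier;  // which datum this runtime belongs to
    double runtime_ms = 0.0; // unit is an assumption of this sketch
};

// True when the opt-in environment variable is set to a non-empty value.
inline bool should_report_runtimes() {
    const char *v = std::getenv("HL_SEND_RUNTIMES_TO_BUILDBOT_MASTER");
    return v != nullptr && *v != '\0';
}

// Serialize one record as a single tab-separated line ending in '\n'.
// The wire format is an assumption; nothing in the thread specifies one.
inline std::string format_runtime_record(const RuntimeRecord &r) {
    assert(!r.identifier.empty() && "each datum needs an identifier");
    std::ostringstream out;
    out << r.hostname << '\t' << r.commit << '\t' << r.branch << '\t'
        << r.llvm_version << '\t' << r.timestamp << '\t'
        << r.identifier << '\t' << r.runtime_ms << '\n';
    return out.str();
}
```

A test would call `should_report_runtimes()` once, and only if it returns true build a record and hand the formatted line to whatever transport is chosen.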

One piece of complexity is that if we have two bots with different configurations, either of which can run a build, we're not going to get performance stats from both of them for a given PR. The ratio over the last run of main on that bot would still be useful though.
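To make the ratio idea concrete, here is a small sketch of how a per-PR report could compare runtimes against the last run of main on the same bot. The function name and the use of `std::map` keyed by datum identifier are assumptions for illustration, not an existing API.

```cpp
#include <map>
#include <string>

// Hypothetical helper: for each datum identifier present in both runs,
// compute pr_runtime / main_runtime. A value above 1.0 means the PR is slower.
// Identifiers with no baseline, or a non-positive baseline, are skipped.
inline std::map<std::string, double> runtime_ratios(
    const std::map<std::string, double> &pr_runtimes,
    const std::map<std::string, double> &main_runtimes) {
    std::map<std::string, double> ratios;
    for (const auto &entry : pr_runtimes) {
        auto it = main_runtimes.find(entry.first);
        if (it == main_runtimes.end() || it->second <= 0.0) {
            continue;
        }
        ratios[entry.first] = entry.second / it->second;
    }
    return ratios;
}
```

Because both inputs are assumed to come from the same bot, differences in hardware between bots drop out of the ratio.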

alexreinking · 2024-10-30T13:56:27Z

> by checking for the environment var HL_SEND_RUNTIMES_TO_BUILDBOT_MASTER ... maybe it can go directly to some rest api

CTest has some features for submitting extended test results to a CI system. We're already using them to get per-test error outputs on the buildbot dashboard. It would probably be best to continue building in this direction, rather than to add an HTTP client to our C++ test dependencies.

abadams · 2024-10-30T15:02:55Z

It wouldn't need an HTTP client dependency. It's just opening a TCP socket and sending a formatted string. It's not a lot of code to do that with standard system includes.
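For a sense of scale, a rough sketch of that using only POSIX socket headers might look like the following. The function name is hypothetical, the host and port would have to come from configuration that this thread does not specify, and a Windows build would need the Winsock equivalents instead.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstddef>
#include <string>

// Hypothetical helper: connect to host:port over TCP, send one string, close.
// Returns false on any failure so a test can ignore reporting errors.
inline bool send_line_over_tcp(const std::string &host,
                               const std::string &port,
                               const std::string &line) {
    addrinfo hints = {};
    hints.ai_family = AF_UNSPEC;      // allow IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP
    addrinfo *results = nullptr;
    if (getaddrinfo(host.c_str(), port.c_str(), &hints, &results) != 0) {
        return false;
    }

    // Try each resolved address until one accepts the connection.
    int fd = -1;
    for (addrinfo *ai = results; ai != nullptr; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    if (fd < 0) return false;

    // send() may write fewer bytes than requested, so loop until done.
    // Note: a peer that closes early can raise SIGPIPE with flags == 0.
    std::size_t sent = 0;
    bool ok = true;
    while (sent < line.size()) {
        ssize_t n = send(fd, line.data() + sent, line.size() - sent, 0);
        if (n <= 0) {
            ok = false;
            break;
        }
        sent += static_cast<std::size_t>(n);
    }
    close(fd);
    return ok;
}
```

This sketch blocks on connect and send with no timeout, which a real helper would probably want to bound so a slow logging endpoint cannot stall a test run.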

I want it to go somewhere where I can write an SQL-like query and have it draw me a graph in a webpage. That's going to involve generating the data in a more structured way than a log, and sending it somewhere somehow. It's fine with me if CTest is involved, but I'm skeptical it would be easier to put the logic there than in a C++ header included by the tests.

abadams added the "enhancement" (New user-visible features or improvements to existing features.) and "contributor project" labels · Oct 29, 2024
