
Add performance tracking machinery #1509

Merged
merged 3 commits into main from serramatutu/perf-tracking on Jan 24, 2025
Conversation

@serramatutu (Contributor) commented Nov 5, 2024

NOTE: I had to significantly refactor the original version of this PR to make it use cProfile, so I changed the description and comments to match.

Summary

This PR introduces some new testing functionality that we can use to avoid performance regressions. From a user perspective, the performance benchmarking workflow is:

  1. Create a new test in tests_metricflow/performance/ that does something we want to track performance for. You can use the measure_compilation_performance fixture to measure the compilation of some SQL based on a query spec and a manifest (see the sketch after this list).
  2. Run make perf to generate a performance-report.json.
  3. Run make perf-compare A=base-report.json B=other-report.json to generate a Markdown file with a report on changes.
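
For concreteness, here is a minimal sketch of such a test, assuming the fixture yields a callable that takes a query spec and a manifest; simple_query_spec and simple_manifest are hypothetical helper fixtures, not names from this PR:

```python
# tests_metricflow/performance/test_simple_manifest.py (illustrative sketch)
def test_simple_query(
    measure_compilation_performance,  # fixture from this PR; exact signature assumed
    simple_query_spec,                # hypothetical fixture providing a query spec
    simple_manifest,                  # hypothetical fixture providing a manifest
) -> None:
    # The fixture yields a callable that compiles the query under cProfile
    # and records the measurements under this test's session ID.
    measure_compilation_performance(
        query_spec=simple_query_spec,
        manifest=simple_manifest,
    )
```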

The performance report

The performance-report.json file contains the "raw" data for a given run. It looks like this:

{
    "sessions": {
        "test_simple_manifest.py::test_simple_query": {
            "session_id": "test_simple_manifest.py::test_simple_query",
            "total_time": 0.04,
            "functions": {
                "source_scan_optimizer.py:115 :: __init__": {
                    "function_name": "source_scan_optimizer.py:115 :: __init__",
                    "base_calls": 1,
                    "total_calls":1,
                    "body_time": 0.01,
                    "per_call_body_time": 0.01,
                    "total_time": 0.03,
                    "per_call_total_time": 0.03
                },
                ...
            }
        },
        ...
    }
}

Each "report" contains a list of "sessions". A session is a run where multiple measurements have been taken, i.e a pytest test. Each session contains a variety of "functions", which contain runtime statistics about each one of them.

Report comparison

I created a script called compare_reports.py which takes in two different report files and calculates the difference between them (in absolute values and in percent). It then generates a Markdown file called performance-comparison.md with a nice little table like the one in my first comment.
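
At its core, the comparison boils down to absolute and percent differences per measurement. A minimal sketch of that calculation (class and field names here are hypothetical, not the PR's actual classes):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricComparison:
    """Compares one measurement between a base report and another report."""
    base: float
    other: float

    @property
    def absolute_difference(self) -> float:
        return self.other - self.base

    @property
    def percent_difference(self) -> float:
        # Guard against division by zero for near-instant base measurements.
        if self.base == 0:
            return float("inf") if self.other != 0 else 0.0
        return (self.other - self.base) / self.base * 100
```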

I added a time.sleep(1) to build_plan() so that I could get a non-negligible result on the first comment.

How it works

When you run make perf, the measure_compilation_performance fixture instantiates a PerformanceTracker and yields a _measure() function that will compile a query while running cProfile.
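
For illustration, a cProfile-backed measurement session could be structured roughly like this (a sketch of the idea only; the shape of PerformanceTracker's actual API is an assumption):

```python
import cProfile
import pstats
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def profiled_session(
    session_id: str, stats_by_session: Dict[str, pstats.Stats]
) -> Iterator[None]:
    """Profile everything executed inside the block with cProfile."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield
    finally:
        profiler.disable()
        # pstats exposes per-function call counts plus cumulative and body
        # times, which map onto the fields shown in the JSON report above.
        stats_by_session[session_id] = pstats.Stats(profiler)
```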

All the tests then run, and at the end the fixture compiles the reports by calling perf_tracker.report_set, which returns all the gathered SessionReports containing the data for every profiled function.

After this, it writes the report to the JSON file.

Then, when you run make perf-compare, the script loads both JSON files and uses the compare() methods on each of the report classes. These generate Comparison objects with the percent and absolute differences, which the script then renders to Markdown and writes to a file. The script filters the comparisons to show only the top 10 most relevant ones whose total runtime difference is non-negligible (i.e. the percent difference is higher than 10e-7). If any total runtime has increased by over 10%, it adds warnings ( ⚠️ ) to the file.
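
The filtering and warning rules could look roughly like this (function and attribute names are assumptions; only the thresholds come from the description above):

```python
# Illustrative sketch only; the real comparison classes in the PR may differ.
NEGLIGIBLE_PERCENT_DIFFERENCE = 10e-7
WARNING_PERCENT_INCREASE = 10.0


def top_comparisons(comparisons, limit=10):
    """Keep the `limit` largest non-negligible absolute total-time differences."""
    relevant = [
        c for c in comparisons
        if abs(c.percent_difference) > NEGLIGIBLE_PERCENT_DIFFERENCE
    ]
    relevant.sort(key=lambda c: abs(c.absolute_difference), reverse=True)
    return relevant[:limit]


def needs_warning(session_comparison) -> bool:
    """A session is flagged with ⚠️ if its total runtime grew by over 10%."""
    return session_comparison.percent_difference > WARNING_PERCENT_INCREASE
```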

Next steps

Make a Github Actions bot that runs this on PRs to compare performance against main, and comments PRs with the generated markdown.


github-actions bot commented Nov 5, 2024

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@serramatutu (Contributor, Author) commented Nov 5, 2024

Performance comparison

Comparing performance-report-sleep.json against performance-report.json

Reporting top 10 highest non-negligible absolute total time differences for each session.

Legend

  • Function: the function name
  • Base calls: number of calls, excluding recursion
  • Total calls: number of calls, including recursion
  • Body time: time spent on the function body, excluding sub-function calls
  • Total time: time spent on the function body and in other sub-functions it calls

n.d. means "negligible difference", i.e. the difference between measured outputs is negligible and could be down to imprecision, so the reported value is only the absolute measured value.

Warnings ( ⚠️ ) will be emitted whenever there is more than 10% increase in total runtime of a test.


test_simple_manifest.py::test_simple_query ⚠️ ⚠️

Total time: 1s - 0s = 1s (+96.774%)

| i | Function | Base calls | Total calls | Body time (cumulative) | Body time (per call) | Total time (cumulative) | Total time (per call) |
|---|----------|------------|-------------|------------------------|----------------------|-------------------------|-----------------------|
| #1 | dataflow_plan_builder.py:139 :: build_plan | 1 (n.d.) | 1 (n.d.) | 0s (n.d.) | 0s (n.d.) | 1s - 0s = 1s (+97.987%) | 0s (n.d.) |

test_simple_manifest.py::test_simple_query_2 ⚠️ ⚠️

Total time: 1s - 0s = 1s (+96.967%)

| i | Function | Base calls | Total calls | Body time (cumulative) | Body time (per call) | Total time (cumulative) | Total time (per call) |
|---|----------|------------|-------------|------------------------|----------------------|-------------------------|-----------------------|
| #1 | dataflow_plan_builder.py:139 :: build_plan | 1 (n.d.) | 1 (n.d.) | 0s (n.d.) | 0s (n.d.) | 1s - 0s = 1s (+98.178%) | 0s (n.d.) |

This commit adds the basic functionality needed for measuring performance. This includes:
- classes to measure performance in different sessions using context managers
- a context manager that can be used to instrument certain methods with performance tracking machinery
- test fixtures for automatically generating reports using these classes

This commit adds `Comparison` classes which contain comparisons between performance measurements taken at two different points in time. It uses these classes in a new script that can be called to generate a markdown report of what changed regarding performance.
@serramatutu force-pushed the serramatutu/perf-tracking branch from c7eb66b to cb61a7d on January 24, 2025 10:00
@serramatutu requested a review from a team as a code owner on January 24, 2025 10:00
@serramatutu force-pushed the serramatutu/perf-tracking branch from 2584d73 to e2493f6 on January 24, 2025 15:29
@serramatutu force-pushed the serramatutu/perf-tracking branch from e2493f6 to f598cb0 on January 24, 2025 15:56
@plypaul (Contributor) left a comment

🎉

@plypaul merged commit d10d263 into main on Jan 24, 2025
24 of 25 checks passed
@plypaul deleted the serramatutu/perf-tracking branch on January 24, 2025 18:05