CI benchmarking suite #533

MilesCranmer · 2024-07-29T09:59:34Z

This defines a simple BenchmarkTools.jl suite, using BENCHMARKS.md as the source for the actual benchmark code. Feel free to add other things we should be keeping track of, or maybe some integration benchmarks.
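For anyone who wants to add a case, here is a minimal sketch of how a hierarchical BenchmarkTools.jl suite is typically laid out. It is not the contents of this PR's benchmark/benchmarks.jl: the `fill_dict` workload and the group names are hypothetical stand-ins that use a plain Julia `Dict`, so the snippet runs without PythonCall. The only convention assumed is that the file defines a top-level `SUITE::BenchmarkGroup`.

```julia
using BenchmarkTools

# Top-level group; nested groups give names like "basic/dict/init".
const SUITE = BenchmarkGroup()
SUITE["basic"] = BenchmarkGroup()
SUITE["basic"]["dict"] = BenchmarkGroup()

# Hypothetical workload standing in for the real PythonCall benchmarks.
function fill_dict(n)
    d = Dict{Int,Int}()
    for i in 1:n
        d[i] = i
    end
    return d
end

# Time construction of the dictionary.
SUITE["basic"]["dict"]["init"] = @benchmarkable fill_dict(1_000)

# Time clearing it. `setup` rebuilds the dictionary for each sample, and
# `evals=1` keeps every measurement on a freshly filled dictionary.
SUITE["basic"]["dict"]["delete"] = @benchmarkable empty!(d) setup=(d = fill_dict(1_000)) evals=1
```

Locally, `run(SUITE; verbose=true)` should execute every leaf of the suite and return a `BenchmarkGroup` of trial results.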

This also uses AirspeedVelocity.jl in CI, which parses the benchmark results and posts them automatically. Every pull request will get a comment like SymbolicML/DynamicExpressions.jl#94 (comment) with all of the performance info. It also measures the change in load time, which is quite useful.

I find it nice for catching performance regressions in PRs. It is used in these packages: https://github.com/search?q=/Pkg.build%5C(%22AirspeedVelocity%22%5C)/+language:YAML&type=code&l=YAML

You can run this benchmark suite with:

# julia -e 'using Pkg; pkg"add AirspeedVelocity"; pkg"build AirspeedVelocity"'
benchpkg

By default it just compares main to dirty (the current state of the working tree), but you can also test across version history with:

benchpkg -r v0.9.18,v0.9.19,v0.9.20,v0.9.21,dirty

which gives me the output:

| | v0.9.18 | v0.9.19 | v0.9.20 | v0.9.21 | dirty |
|---|---|---|---|---|---|
| `@py/pydict/init` | 0.198 ± 0.0057 ms | 0.201 ± 0.0077 ms | 0.199 ± 0.007 ms | 0.197 ± 0.0075 ms | 0.197 ± 0.0065 ms |
| `@py/pydict/pydel` | 0.304 ± 0.0067 ms | 0.208 ± 0.0061 ms | 0.208 ± 0.0071 ms | 0.203 ± 0.0073 ms | 0.203 ± 0.0077 ms |
| `julia/pydict/init` | 0.193 ± 0.0072 ms | 0.2 ± 0.0076 ms | 0.193 ± 0.0072 ms | 0.193 ± 0.0097 ms | 0.19 ± 0.0073 ms |
| `julia/pydict/pydel` | 0.196 ± 0.0093 ms | 0.21 ± 0.0065 ms | 0.198 ± 0.0084 ms | 0.195 ± 0.0079 ms | 0.195 ± 0.0084 ms |
| `time_to_load` | 0.708 ± 0.013 s | 0.707 ± 0.017 s | 0.707 ± 0.009 s | 0.688 ± 0.046 s | 0.692 ± 0.033 s |

View all the options with benchpkg -h.

MilesCranmer · 2024-08-02T15:32:50Z

I added @ericphanson's GC benchmarks here too, via git cherry-pick.

MilesCranmer · 2024-08-02T16:38:56Z

Here are the new results, comparing #529 with main:

| | main | dirty | main/dirty |
|---|---|---|---|
| basic/@py/pydict/init | 0.194 ± 0.006 ms | 0.209 ± 0.0064 ms | 0.925 |
| basic/@py/pydict/pydel | 0.199 ± 0.007 ms | 0.217 ± 0.0072 ms | 0.915 |
| basic/julia/pydict/init | 0.187 ± 0.014 ms | 0.195 ± 0.0074 ms | 0.959 |
| basic/julia/pydict/pydel | 0.191 ± 0.0064 ms | 0.209 ± 0.0067 ms | 0.917 |
| gc/full | 1.07 ± 0.031 s | 1.19 ± 0.012 s | 0.901 |
| time_to_load | 0.666 ± 0.049 s | 0.666 ± 0.0045 s | 1 |

Does this make sense? A bit slower GC, but now with thread safety?

MilesCranmer and others added 10 commits July 29, 2024 10:45

create simple benchmark suite

d0a97df

more hierarchy in benchmark

410e31c

create AirspeedVelocity github action for benchmarks

cd19489

fix imports in benchmark

072423d

fix default branch name in CI

5f61a9c

benchmarks: more hierarchy

fcc687a

timing + benchmark

4f78589

benchmarks: include gcbench in runs

397ef3f

benchmarks: remove unused code

37c89d7

benchmarks: modularize

5295cc7

ericphanson reviewed Aug 2, 2024

benchmarks: give more time to GC benchmark

5d4e630

benchmarks: simplify naming scheme

ae73e43

ericphanson reviewed Aug 2, 2024

benchmarks: avoid issue of tuneing away from evals=1

83ddd09
