Add benchmark framework #428

Draft
wants to merge 1 commit into
base: master

Woazboat (Contributor) commented Jul 17, 2024

Add a benchmarking framework using the https://github.com/google/benchmark library.

Could be useful to judge the impact of changes like #427 or for finding performance regressions.

This is not really polished yet; the PR is mainly intended to discuss whether this is wanted at all. The included benchmarks are mostly meant as examples.

Usage:
pg_virtualenv ./cgimap-benchmark
or e.g.
pg_virtualenv ./cgimap-benchmark --benchmark_out=bm_result.json --benchmark_out_format=json --benchmark_repetitions=5

mmd-osm (Collaborator) commented Jul 27, 2024

I think Google Benchmark could be a good fit for CPU-bound activities such as parsing XML/JSON or generating XML/JSON responses, possibly also with different compression settings. We could define scenarios with varying numbers of nodes/ways/relations and create/modify/delete operations, as needed. For that kind of workload, this would be totally fine and a good enhancement.
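
To make the scenario idea concrete, here is a minimal sketch of how a payload of varying size could be generated and timed. It is only an illustration under stated assumptions: `make_create_payload`, `count_nodes` and `average_ns` are hypothetical names that do not exist in cgimap, the XML is a simplified osmChange-like document, and the `std::chrono` loop is a dependency-free stand-in for the measurement loop that Google Benchmark would provide.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a simplified osmChange-like document that
// creates `node_count` nodes with negative placeholder ids.
std::string make_create_payload(std::size_t node_count) {
  std::string xml = "<osmChange version=\"0.6\"><create>";
  for (std::size_t i = 0; i < node_count; ++i) {
    xml += "<node id=\"-" + std::to_string(i + 1) +
           "\" lat=\"0.0\" lon=\"0.0\" changeset=\"1\"/>";
  }
  xml += "</create></osmChange>";
  return xml;
}

// Stand-in for the CPU-bound work under test. A real benchmark would call
// the actual XML parser here; this only counts "<node " occurrences.
std::size_t count_nodes(const std::string& xml) {
  const std::string needle = "<node ";
  std::size_t count = 0;
  for (std::size_t pos = xml.find(needle); pos != std::string::npos;
       pos = xml.find(needle, pos + needle.size())) {
    ++count;
  }
  return count;
}

// Returns the average wall-clock time per call in nanoseconds, or -1.0 if
// the workload produced an unexpected result. Payload setup is not timed.
double average_ns(std::size_t node_count, std::size_t iterations) {
  if (iterations == 0) {
    return 0.0;
  }
  const std::string xml = make_create_payload(node_count);
  std::size_t total = 0;
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    total += count_nodes(xml);
  }
  const auto stop = std::chrono::steady_clock::now();
  // Sanity check on the accumulated result; this also keeps it in use.
  if (total != node_count * iterations) {
    return -1.0;
  }
  const std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iterations);
}
```

Calling `average_ns` for, say, 10, 1,000 and 100,000 nodes would give one data point per scenario. With the library from this PR, the payload size would more naturally be passed as a benchmark argument, and the library would take care of iteration counts and repetitions.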

With database queries, we have a lot more boundary conditions: the database version, parameter settings, storage system, up-to-date indexes, existing data, test data volume, etc. The last two points in particular are a bit tricky, because queries can be fast on a nearly empty database and very slow once we reach several hundred million entries (see the OHM experience). They could be slow due to some optimizer decision or due to poor query design. The next scenario is handling parallel requests (such as multiple users uploading changes at the same time). Here, too, we need to make sure that we scale properly across requests, that users are not locking each other out, etc.

Long story short, I'm not convinced that Google Benchmark is the right tool for this use case. I'm currently using various command-line tools to simulate single and multiple users, as well as existing editing apps for testing, and some off-the-shelf tools for database analysis.
