sirun
(pronounced like "siren") is a tool for taking basic performance
measurements of a process covering its entire lifetime. It gets memory and
timing information from the kernel, and also allows
Statsd messages to be sent to
udp://localhost:$SIRUN_STATSD_PORT
(the port is assigned randomly by sirun,
but you can also set it yourself); those metrics will be included in the
output.
It's intended that this tool be used for shorter-running benchmarks, and not for long-lived processes that don't die without external interaction. You could certainly use it for long-lived processes, but that's not where it shines.
To install from crates.io:

cargo install sirun

Release bundles are provided for each supported platform. Extract the binary somewhere and use it.

To build from source, make sure you have rustup
installed, and use it to
ensure you have the latest stable Rust toolchain enabled. Then, from a checkout of this repository:

cargo install --path .
You can also install directly from the git repository. With SSH:

cargo install --git ssh://git@github.com:22/DataDog/sirun.git --branch main

or with HTTPS:

cargo install --git https://github.com/DataDog/sirun.git --branch main
See also the documentation.
Create a JSON or YAML file with the following properties:

name
: This will be included in the results JSON.

run
: The command to run and test. You can format this like a shell command with arguments, but note that it will not use a shell as an intermediary process. Note that subprocesses will not be measured via the kernel, but they can still use Statsd. To send metrics to Statsd from inside this process, send them to udp://localhost:$SIRUN_STATSD_PORT.

service
: A command to start a process to be run alongside your test process. This is useful, for example, for running a web service for your program to call out to, or a load-generating tool aimed at your program. It should generally be used in conjunction with setup, which can be used to determine whether the service process is ready. There is no retry logic. After the test run has completed, the process will be sent a SIGKILL.

setup
: A command to run before the test. Use this to ensure the availability of services, or to retrieve some last-minute dependencies. This can be formatted the same way as run. It will be run repeatedly at 1-second intervals until it exits with status code 0.

teardown
: A command to run after the test. This is run in the same manner as setup, except after the test has run instead of before.

timeout
: If provided, this is the maximum time, in seconds, a run test can run for. If it times out, sirun will exit with no results, aborting the test.

env
: A set of environment variables to make available to the run and setup programs. This should be an object whose keys are the environment variable names and whose values are the environment variable values.

iterations
: The number of times to run the run test. The results for each iteration will be in an iterations array in the resultant JSON. The default is 1.

cachegrind
: If set to true, will run the test (after having already run it normally) using cachegrind to add an instruction count to the results JSON. This will only happen once, and the result will be inserted into the top-level JSON, regardless of iterations. This requires that valgrind is installed on your system.

instructions
: If set to true, will take instruction counts from hardware counters if available, adding the result under the key instructions for each iteration. This is only available on Linux with CAP_SYS_PTRACE.

variants
: An array or object whose values are config objects, whose properties may be any of the properties above. It's not recommended to include name in a variant. The variant name (if variants is an object) or index (if variants is an array) will be included in the resultant JSON.
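Since configs can be written in YAML as well as JSON, here is a minimal YAML sketch using a few of these properties. The command and values are purely illustrative, not taken from a real benchmark:

# benchmark.yaml (illustrative)
name: my-benchmark
run: node benchmark.js    # hypothetical command under test
timeout: 60
iterations: 5
variants:
  control:
    env:
      USE_TRACER: "0"
  with-tracer:
    env:
      USE_TRACER: "1"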
In addition to the config file, sirun recognizes the following environment variables:

GIT_COMMIT_HASH
: If set, will include a version in the results.

SIRUN_NAME
: If set, will include a name in the results. This overrides any name property set in the config JSON/YAML.

SIRUN_NO_STDIO
: If set, suppresses output from the tested program.

SIRUN_VARIANT
: Selects which variant of the test to run. If the variants property exists in the config JSON/YAML and this variable is not set, then all variants will be run, one-by-one, each having its own line of output JSON.

SIRUN_STATSD_PORT
: The UDP port on localhost to use for Statsd communication between tested processes and sirun. By default a random port will be assigned. You should read this variable from tested programs to determine which port to send data to.
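For example, to run only the control variant of the example config shown below and suppress the tested program's own output, an invocation along these lines should work (a usage sketch; the file name is the same placeholder used later in this README):

SIRUN_VARIANT=control SIRUN_NO_STDIO=1 sirun ./my_benchmark.json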
Here's an example JSON file. As an example of a setup
script, it's checking for
connectivity to Google. The run
script doesn't do much, but it does send a
single metric (with name udp.data
and value 50) via Statsd. It times out after
4 seconds, and we're not likely to reach that point.
There are two variants of this test, one named control
and the other with-tracer
.
The variants set environment variables, though the run
script doesn't really care
about the variable. Since there are 3 iterations and 2 variants, the run
command will
run a total of 6 times.
{
"name": "foobar",
"setup": "curl -I http://www.google.com -o /dev/null",
"run": "bash -c \"echo udp.data:50\\|g > /dev/udp/127.0.0.1/$SIRUN_STATSD_PORT\"",
"timeout": 4,
"iterations": 3,
"variants": {
"control": {
"env": { "USE_TRACER": "0" }
},
"with-tracer": {
"env": { "USE_TRACER": "1" }
}
}
}
You can then pass this JSON file to sirun
on the command line. Remember that
you can use environment variables to set the git commit hash and test name in
the output.
SIRUN_NAME=test_some_stuff GIT_COMMIT_HASH=123abc sirun ./my_benchmark.json
This will output something like the following.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
{"version":"123abc","name":"test_some_stuff",iterations:[{"user.time":6389.0,"system.time":8737.0,"udp.data":50.0,"max.res.size":2240512.0}]}
If you provide the --summarize
option, sirun
will switch to summary mode. In
summary mode, it reads from stdin
, expecting line-by-line output from
previous sirun runs. It will then aggregate them by test name and variant, and
provide summary statistics over iterations. The output is pretty-printed JSON.
E.g.
$ sirun foo-test.json >> results.ndjson
$ sirun bar-test.json >> results.ndjson
$ sirun baz-test.json >> results.ndjson
$ cat results.ndjson | sirun --summarize > summary.json
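With several benchmark configs, the same pipeline can be written as a loop; this is just a sketch, and the benchmarks/ directory layout is hypothetical:

# Run every benchmark config, appending each run's output line(s) to results.ndjson.
for f in benchmarks/*.json; do
  sirun "$f" >> results.ndjson
done

# Aggregate all runs into pretty-printed summary statistics.
sirun --summarize < results.ndjson > summary.json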
Each line of output in one of these .ndjson
files is a complete JSON document.
Here's an example of one of these lines of output, though whitespace has been added for readability:
{
"name": "foobar",
"variant": "with-baz",
"iterations": [
{
"cpu.pct.wall.time": 18.138587328535383,
"max.res.size": 66956,
"system.time": 766091,
"user.time": 1552557,
"wall.time": 12782958
},
{
"cpu.pct.wall.time": 18.029851850045276,
"max.res.size": 66480,
"system.time": 720491,
"user.time": 1571242,
"wall.time": 12710770
}
]
}
name
: This is the same name value from the configuration file.

variant
: This is the object key (or array index) from the configuration file's variants property.

iterations
: These are the statsd metrics from the individual runs and contain raw data.

max.res.size
: Maximum resident set size (RSS), in kibibytes (KiB), i.e. the peak RAM usage.

system.time
: Time spent in kernel code, in microseconds (μs).

user.time
: Time spent in application code, in microseconds (μs).

wall.time
: Time the overall iteration took, in microseconds (μs).

cpu.pct.wall.time
: Percentage (%) of time where the program was not waiting, i.e. (user + system) / wall.
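For instance, in the first iteration of the example above, (1552557 + 766091) / 12782958 ≈ 0.1814, or about 18.14%, which matches the reported cpu.pct.wall.time.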
The metrics in this list are created automatically for you by sirun. Your application is free to emit other metrics as well; those additional metrics will also be included in the output.
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.