Monitor various metrics for KVM/qemu guests and output in a top-like fashion. Also aggregate the usage/allocation metrics at the NUMA node and host-level.
It can output as text on the console and write CSV files. The graph-vmtop.py
script generates graphs from those CSV files (record with --csv <dir>
).
This script can also be ran with a Prometheus exporter enabled, like so:
sudo ./vmtop.py --prometheus [host:port]
The host and port are optional, and will default to localhost:8000 if not specified.
Example output for the top 10 VMs experiencing the most vcpu steal per node:
$ sudo ./vmtop.py --vm --limit 10 --sort vcpu_steal
[...]
Node 0:
Name PID vcpu util vcpu steal vhost util vhost steal emu util emu steal disk read disk write rx tx
Guest01 48642 141.73 % 32.86 % 0.30 % 0.06 % 0.57 % 0.48 % 0.00 MB/s 0.00 MB/s 0.00 MB/s 0.00 MB/s
Guest02 16830 219.07 % 29.74 % 28.45 % 22.13 % 0.54 % 1.30 % 0.00 MB/s 0.01 MB/s 2.49 MB/s 2.43 MB/s
Guest03 17152 119.61 % 27.87 % 13.89 % 16.31 % 0.48 % 0.12 % 0.00 MB/s 0.00 MB/s 1.45 MB/s 1.39 MB/s
Guest04 60782 52.90 % 16.70 % 0.79 % 0.75 % 0.61 % 0.19 % 0.00 MB/s 0.01 MB/s 0.02 MB/s 0.03 MB/s
Guest05 33077 46.47 % 12.69 % 2.02 % 1.82 % 3.11 % 0.97 % 0.00 MB/s 2.06 MB/s 0.24 MB/s 0.14 MB/s
Guest06 39435 41.70 % 10.01 % 17.36 % 12.85 % 0.98 % 0.08 % 0.00 MB/s 0.00 MB/s 0.27 MB/s 0.30 MB/s
Guest07 5196 26.49 % 9.28 % 0.03 % 0.00 % 0.51 % 0.09 % 0.00 MB/s 0.00 MB/s 0.00 MB/s 0.00 MB/s
Guest08 56822 55.63 % 8.77 % 9.33 % 5.61 % 0.53 % 0.17 % 0.00 MB/s 0.00 MB/s 0.30 MB/s 0.30 MB/s
Guest09 65751 36.59 % 8.54 % 6.26 % 3.42 % 0.76 % 0.06 % 0.08 MB/s 0.04 MB/s 0.18 MB/s 0.18 MB/s
Guest10 25320 26.61 % 8.44 % 0.06 % 0.00 % 5.30 % 2.72 % 0.00 MB/s 0.06 MB/s 0.00 MB/s 0.00 MB/s
Node 0: vcpu util: 80.85%, vcpu steal: 6.49%, emulators util: 1.71%, emulators steal: 0.89%
Node 0: 79 VMs (168 vcpus, 306.00 GB mem allocated, 231.98 GB mem used)
Node 1:
Name PID vcpu util vcpu steal vhost util vhost steal emu util emu steal disk read disk write rx tx
Guest11 14506 29.49 % 0.70 % 0.00 % 0.00 % 0.45 % 0.01 % 0.00 MB/s 0.01 MB/s 0.00 MB/s 0.00 MB/s
Guest12 52580 28.69 % 0.63 % 4.55 % 0.26 % 0.60 % 0.08 % 0.12 MB/s 0.00 MB/s 0.16 MB/s 0.15 MB/s
Guest13 60864 36.45 % 0.56 % 0.06 % 0.00 % 4.25 % 0.06 % 0.00 MB/s 0.03 MB/s 0.00 MB/s 0.00 MB/s
Guest14 69426 64.98 % 0.54 % 11.06 % 0.79 % 0.09 % 0.00 % 0.00 MB/s 0.02 MB/s 4.20 MB/s 3.87 MB/s
Guest15 52138 52.45 % 0.39 % 10.66 % 0.64 % 0.75 % 0.02 % 0.22 MB/s 0.00 MB/s 0.43 MB/s 0.38 MB/s
Guest16 38308 57.05 % 0.25 % 12.03 % 0.93 % 1.18 % 0.03 % 0.51 MB/s 0.00 MB/s 0.47 MB/s 0.56 MB/s
Guest17 7700 11.87 % 0.25 % 0.08 % 0.00 % 4.16 % 0.05 % 0.00 MB/s 0.03 MB/s 0.00 MB/s 0.00 MB/s
Guest18 39542 39.46 % 0.24 % 7.61 % 0.51 % 0.77 % 0.02 % 0.16 MB/s 0.00 MB/s 0.27 MB/s 0.27 MB/s
Guest19 1381 49.03 % 0.24 % 9.01 % 0.72 % 0.76 % 0.05 % 0.25 MB/s 0.00 MB/s 0.35 MB/s 0.43 MB/s
Guest20 6945 36.15 % 0.24 % 6.42 % 0.45 % 0.77 % 0.04 % 0.21 MB/s 0.00 MB/s 0.23 MB/s 0.36 MB/s
Node 1: vcpu util: 41.88%, vcpu steal: 0.20%, emulators util: 1.87%, emulators steal: 0.08%
Node 1: 131 VMs (161 vcpus, 210.00 GB mem allocated, 193.42 GB mem used)
The tool currently depends on numastat
being in the $PATH
.
Besides that dependency, one design rule of this tool is that it should be easy to just download it and run without any additional setup. We use very common Python modules and the advanced features such as Prometheus and eBPF are optional.
This tool has been mostly tested on Ubuntu Bionic 18.04, but should be easy to run on other platforms.
The optional dependencies are:
python3-daemon
for the the--daemon
optionpython3-bcc
for the--vmexit
optionpython3-prometheus-client
for the--prometheus
option
The node-level information assumes a VM mostly runs where most of its memory is allocated, if a VM can float between NUMA nodes, the node-level information may not be accurate (and a warning will be shown). The VM-level data is accurate regardless of the pinning.
This tool is a quick script to solve the issue of live monitoring resource allocation and usage by VMs. We plan to keep improving it over time.
In the extras
folder, there are other ad-hoc tools for monitoring various
performance aspects of KVM. The tools are in various states of maintenance,
feel free to reach out if you have questions or suggestions for improvements.
The features from those scripts may end up in vmtop at some point, but for now
they are tested outside.
bpftrace
tool to check statistically how long a vCPU spends inside the guest
when it is scheduled in.
The core-sched-stats.py
script ensures that the core scheduling
feature works properly and accounts for the time spent by a vCPU in various
scheduling modes (co-scheduled with idle, with a compatible task, or a foreign
task).
This script works with perf trace recorded:
sudo perf record -e kvm:kvm_entry -e kvm:kvm_exit -e sched:sched_switch -e sched:sched_waking -o perf.data -a sleep 60
and can be converted to CTF like so (requires perf compiled with CTF support):
perf data convert --to-ctf=./ctf -i perf.data
If you see this message and the number of chunks is greater than 1 or 2, consider writing your perf.data
to a ramdisk instead
of your local disk:
Processed 7378132 events and lost 1 chunks!
Check IO/CPU overload!
It depends on Babeltrace compiled with the python library and Perf compiled with the CTF support. On bionic:
apt-get install libbabeltrace-dev libbabeltrace-ctf-dev python3-babeltrace babeltrace
Then rebuild perf
so it will detect the new library.
In order to run kvm_co_sched_stats.py
, the siblings list must be provided in a text file with the --topology
flag.
To collect this data, run this on the target HV:
for i in /sys/devices/system/cpu/cpu*/topology/thread_siblings_list; do cat $i; done > out.txt