Tutorial: reproducibility study for the TinyMLPerf submission with microTVM and the NUCLEO-L4R5ZI board from STMicroelectronics
The MLCommons task force on automation and reproducibility, cTuning foundation and cKnowledge Ltd organize public challenges to let the community run, visualize and optimize MLPerf benchmarks out of the box across diverse software, hardware, models and data.
This tutorial demonstrates how to run and/or reproduce the TinyMLPerf benchmark (OctoML's v1.0 submission) with the help of the MLCommons CM automation language.
You will build, flash and run image classification and keyword spotting applications using the microTVM compiler on the NUCLEO-L4R5ZI board from STMicroelectronics.
You will need about 12 GB of disk space, and downloading all dependencies and building the TinyMLPerf benchmarks takes roughly 20 to 30 minutes, depending on your Internet connection and host platform speed.
Benchmark compilation and device flashing can be done on any Linux-based platform, while running the benchmark with the EEMBC GUI can be done on Linux and Windows.
If you have any questions about this tutorial, please get in touch via our public Discord server or open a GitHub issue here.
Please follow this tutorial to install the MLCommons CM automation language, the EEMBC Energy Runner and other software dependencies for your host platform, and to set up the NUCLEO-L4R5ZI board from STMicroelectronics.
We reproduced/replicated OctoML's v1.0 submission using a host machine with Ubuntu 20.04 and Python 3.8.10.
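For reference, a minimal host setup sketch with a pip-based install of CM is shown below; the exact repository name and platform-specific steps (including the EEMBC Energy Runner and board setup) are assumptions here, so follow the linked tutorial above as the authoritative source.
# install the CM automation language (the cmind package) from PyPI
python3 -m pip install cmind
# pull the MLCommons repository that contains the CM automation scripts
cm pull repo mlcommons@ck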
You can use the following CM command to automatically build all benchmarks in all variants and reproduce OctoML's v1.0 submission:
cm run script --tags=generate,tiny,mlperf,octoml,submission
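If CM asks interactive questions while resolving dependencies, it may help to add the common --quiet flag to accept default answers; treat this exact invocation as an assumption and consult the CM documentation if it is rejected:
cm run script --tags=generate,tiny,mlperf,octoml,submission --quiet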
The main CM scripts that are automatically called by the above command are listed below.
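You can also locate the underlying scripts in your local CM repositories by searching for them by tags; the tags used below are an assumption derived from the cache tags used later in this tutorial:
cm find script --tags=reproduce,tiny,octoml,mlperf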
The above command should produce five ELF binaries, which are located inside the respective cache entries listed by the command below:
cm show cache --tags=reproduce,tiny,octoml,mlperf
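To inspect the produced binaries directly, you can combine the cache lookup with a standard shell search; this sketch assumes that cm find cache prints one cache entry path per line:
# print the paths of the matching cache entries
cm find cache --tags=reproduce,tiny,octoml,mlperf
# search those directories for the generated ELF binaries
find $(cm find cache --tags=reproduce,tiny,octoml,mlperf) -name "*.elf"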
To flash each benchmark, use the command below. Replace VARIANT with either cmsis_nn or native, replace MODEL with one of ad, kws, ic or vww, and replace BOARD with either NUCLEO or NRF to select the target board to flash.
cm run script --tags=flash,tiny,_VARIANT,_MODEL,_BOARD
We have tested the following combinations; a parameterized version of the flashing command is sketched after the list:
cm run script --tags=flash,tiny,_cmsis_nn,_ic,_NUCLEO
cm run script --tags=flash,tiny,_native,_ic,_NUCLEO
cm run script --tags=flash,tiny,_cmsis_nn,_kws,_NUCLEO
cm run script --tags=flash,tiny,_native,_kws,_NUCLEO
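The sketch below parameterizes the flashing command with shell variables; since the board holds one benchmark at a time, flash one combination and run it with the EEMBC Runner before switching to the next:
# pick one tested combination, then flash it before the next EEMBC Runner session
VARIANT=cmsis_nn   # or: native
MODEL=ic           # or: kws
cm run script --tags=flash,tiny,_${VARIANT},_${MODEL},_NUCLEO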
After each flashing, follow the EEMBC Runner guide to run the benchmark in performance and accuracy modes.
You can find the logs after each run in the following directory on your host machine: $HOME/eembc/runner/sessions.
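For example, you can list the most recent sessions from a terminal on the host machine, assuming the default EEMBC Runner location shown above:
# show the newest session directories first
ls -lt $HOME/eembc/runner/sessions | head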
Follow this guide.
You can visualize and compare TinyMLPerf results here. You can also use this collaborative platform inside your organization to reproduce and optimize benchmarks and applications of interest to you.
Please follow the rest of this tutorial to see how to visualize and compare your results, and learn more about our future automation plans.
Please join the MLCommons task force on automation and reproducibility to get free help to automate and optimize MLPerf benchmarks for your software and hardware stack using the MLCommons CM automation language!