News (May 2024): our task force has successfully accomplished its first goal of providing a stable CM interface for MLPerf benchmarks and is now discussing the next steps with MLCommons - please stay tuned for more details!
- Extend the MLCommons CM workflow automation framework and reusable automation recipes (CM scripts) to automate MLCommons projects and make it easier to assemble, run, reproduce, customize and optimize ML(Perf) benchmarks in a unified and automated way across diverse models, data sets, software and hardware from different vendors (see the illustrative Python example after this list).
- Extend CM workflows to automate and reproduce MLPerf inference submissions from different vendors starting from v3.1.
- Encode MLPerf rules and best practices in the CM automation recipes and workflows for MLPerf so that submitters no longer need to go through many README files and track all the latest MLPerf changes and updates themselves.
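For illustration, CM automation recipes can be invoked from the command line or from Python via the cmind package. The minimal sketch below assumes CM is installed via pip and that a repository with CM scripts has already been pulled; the tag selection and input keys are illustrative placeholders, not an exact MLPerf command - the precise values for a given benchmark are documented in the CM-MLPerf commands listed under resources.

```python
# Minimal sketch of driving a CM automation recipe from Python.
# Assumes `pip install cmind` and that a repository with CM scripts has been pulled.
# The tags and input keys below are illustrative placeholders, not an exact MLPerf command.
import cmind

result = cmind.access({
    'action': 'run',
    'automation': 'script',
    'tags': 'run-mlperf,inference',   # illustrative tag selection
    'model': 'resnet50',              # illustrative benchmark inputs
    'device': 'cpu',
    'scenario': 'Offline',
    'quiet': True
})

# CM calls return a dictionary; a non-zero 'return' code signals an error
if result['return'] > 0:
    raise RuntimeError(result.get('error', 'CM script failed'))
```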
We thank cKnowledge.org, cTuning.org, and MLCommons for sponsoring this project!
If you find CM useful, please cite this article: [ ArXiv ], [ BibTeX ].
- Continue improving CM to support different MLCommons projects for universal benchmarking and optimization across different platforms.
- Extend CM workflows to reproduce MLPerf inference v4.0 submissions (Intel, Nvidia, Qualcomm, Google, Red Hat, etc.) via a unified interface.
- Prepare a tutorial for MLPerf inference v4.1 submissions via CM.
- Discuss how to introduce a CM automation badge for MLPerf inference v4.1 submissions, similar to ACM/IEEE/NeurIPS reproducibility badges, to make it easier for all submitters to re-run and reproduce each other's results before the publication date.
- Develop a more universal Python and C++ wrapper for the MLPerf loadgen with CM automation to support different models, data sets, software and hardware: Python prototype; C++ prototype (see the Python sketch after this list).
- Collaborate with system vendors and cloud providers to help them benchmark their platforms using the best available MLPerf inference implementation.
- Collaborate with other MLCommons working groups to automate, modularize and unify their benchmarks using CM automation recipes.
- Use CM to modularize and automate the upcoming automotive benchmark.
- Use MLCommons Croissant to unify MLPerf datasets.
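As a sketch of the universal loadgen wrapper mentioned above, the following minimal Python example registers a dummy system-under-test (SUT) and query sample library (QSL) with the MLPerf loadgen bindings. In a real wrapper, the model, data set and backend hooks would be selected by CM; the binding signatures shown here may vary slightly across loadgen versions.

```python
# Minimal sketch of a Python SUT/QSL wrapper around the MLPerf loadgen bindings.
# The inference step is a placeholder; a universal wrapper would dispatch to
# different backends, models and data sets selected via CM.
# Note: binding signatures may differ slightly between loadgen versions.
import array
import mlperf_loadgen as lg

def load_samples(sample_indices):
    pass  # load the requested samples into memory

def unload_samples(sample_indices):
    pass  # release the samples

def issue_queries(query_samples):
    # Run (placeholder) inference and report completions back to loadgen
    responses = []
    for qs in query_samples:
        output = array.array('B', b'\x00')  # placeholder model output
        addr, length = output.buffer_info()
        responses.append(lg.QuerySampleResponse(qs.id, addr, length))
    lg.QuerySamplesComplete(responses)

def flush_queries():
    pass

settings = lg.TestSettings()
settings.scenario = lg.TestScenario.Offline
settings.mode = lg.TestMode.PerformanceOnly

sut = lg.ConstructSUT(issue_queries, flush_queries)
qsl = lg.ConstructQSL(1024, 128, load_samples, unload_samples)
lg.StartTest(sut, qsl, settings)
lg.DestroyQSL(qsl)
lg.DestroySUT(sut)
```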
- Improving CM workflow automation framework: GitHub ticket
- Updating/refactoring CM docs (framework and MLPerf workflows): GitHub ticket
- Improving CM scripts to support MLPerf: GitHub ticket
- Adding profiling and performance analysis during benchmarking: GitHub ticket
- Improving universal build and run scripts to support cross-platform compilation: GitHub ticket
- Automating ABTF benchmarking via CM: GitHub ticket
- Helping automate the MLPerf inference benchmark at the Student Cluster Competition'24: GitHub ticket
- Developed reusable and technology-agnostic automation recipes and workflows with a common and human-friendly interface (MLCommons Collective Mind, aka CM) to modularize MLPerf inference benchmarks and run them in a unified and automated way across diverse models, data sets, software and hardware from different vendors.
- Added GitHub Actions to test MLPerf inference benchmarks using CM.
- Encoded MLPerf inference rules and best practices in the CM automation recipes and workflows for MLPerf, reducing the burden on submitters to go through numerous README files, track all the latest changes and reproduce results.
- Automated MLPerf inference submissions and made it easier to re-run and reproduce results (see the submitters orientation and the CM-MLPerf documentation).
- Started developing an open-source platform to automatically compose high-performance and cost-effective AI applications and systems using MLPerf and CM (see our presentation at MLPerf-Bench at HPCA'24).
- Supported AI, ML and systems conferences to automate artifact evaluation and reproducibility initiatives (see CM at ACM/IEEE MICRO'23 and SCC'23/SuperComputing'23).
- CM GitHub project
- CM concept (keynote at ACM REP'23)
- CM Getting Started Guide
- CM-MLPerf commands
- CM-MLPerf GUI
- ACM artifact review and badging methodology
- Artifact Evaluation at ML and systems conferences
- Terminology (ACM/NISO): Repeatability, Reproducibility and Replicability
- CM motivation (ACM TechTalk about reproducing 150+ research papers and validating them in the real world)
This task force was established by Grigori Fursin after he donated his CK and CM automation technology to MLCommons in 2022 to benefit everyone. Since then, this open-source technology has been developed as a community effort based on user feedback. We would like to thank all our volunteers, collaborators and contributors for their support, fruitful discussions, and useful feedback!