Merge branch 'master' of https://github.com/ctuning/mlcommons-ck

mlcommons · Sep 21, 2023 · f9d27d9
2 parents 9bb8ab6 + 9614c5d

Showing 773 changed files with 17,367 additions and 4,068 deletions.
diff --git a/.github/workflows/test-cm-tutorial-tvm-pip.yml b/.github/workflows/test-cm-tutorial-tvm-pip.yml
@@ -12,8 +12,29 @@ on:
       - '!cm-mlops/**.md'
 
 jobs:
-  build:
+  test_vm_runtime:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.9"]
+
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v3
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Install dependencies
+      run: |
+        python -m pip install cmind
+        cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}
+        cm run script --quiet --tags=get,sys-utils-cm
+    - name: Test CM Tutorial TVM pip install with VirtualMachine Runtime
+      run: |
+        python tests/tutorials/test_tutorial_tvm_pip_vm.py
 
+  test_ge_runtime:
     runs-on: ubuntu-latest
     strategy:
       fail-fast: false
@@ -31,6 +52,6 @@ jobs:
         python -m pip install cmind
         cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}
         cm run script --quiet --tags=get,sys-utils-cm
-    - name: Test CM Tutorial TVM pip install
+    - name: Test CM Tutorial TVM pip install with GraphExecutor Runtime
       run: |
-        python tests/tutorials/test_tutorial_tvm_pip.py
+        python tests/tutorials/test_tutorial_tvm_pip_ge.py
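The workflow change above splits the single `build` job into two independent jobs, one per TVM runtime, each ending in a plain `python tests/tutorials/...` call. As a rough sketch of how the same pair of checks might be reproduced locally, assuming the repository root is the working directory and the `Install dependencies` steps have already been run by hand (the names `RUNTIME_TESTS`, `build_commands` and `run_all` are hypothetical and not part of the repository):

```python
import subprocess
import sys

# Test scripts named in the two workflow jobs (paths taken from the diff above).
RUNTIME_TESTS = {
    "test_vm_runtime": "tests/tutorials/test_tutorial_tvm_pip_vm.py",
    "test_ge_runtime": "tests/tutorials/test_tutorial_tvm_pip_ge.py",
}


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per workflow job, keyed by job name."""
    return {job: [python, path] for job, path in RUNTIME_TESTS.items()}


def run_all(python=sys.executable):
    """Run every test script and collect exit codes.

    A failure in one script does not stop the other, which mirrors the two
    workflow jobs being independent of each other.
    """
    results = {}
    for job, cmd in build_commands(python).items():
        results[job] = subprocess.run(cmd).returncode
    return results
```

Calling `run_all()` from the repository root would return a dictionary of exit codes per job name; a non-zero value marks the runtime whose tutorial test failed.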
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -16,91 +16,55 @@ Ensure that cla-bot and other checks pass for your Pull requests.
 
 ## Contributing to this project
 
-We suggest you to join the [open MLCommons Collective Knowledge task force](docs/taskforce.md)
-to learn how to use the CK technology, CM scripting language, enhance existing CM components for MLOps and DevOps, 
-run MLPerf benchmarks and contribute your own artifacts, scripts and workflows in the CM format.
+Please join our [Discord server](https://discord.gg/JjWNWXKxwT)
+to learn about how to use the CK technology v3 (including the MLCommons CM automation language, CK playground
+and Modular Inference Library) or participate in collaborative developments.
 
 Thank you for your support and looking forward to collaborating with you!
 
-## Authors and maintainers
+## Authors and project coordinators
 
-* [Grigori Fursin](https://cKnowledge.org/gfursin)
-* [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh)
+* [Grigori Fursin](https://cKnowledge.org/gfursin) ([cTuning foundation](https://cTuning.org) and [cKnowledge.org](https://cKnowledge.org))
+* [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) ([cTuning foundation](https://cTuning.org) and [cKnowledge.org](https://cKnowledge.org))
 
-## Contributors in alphabetical order
+## Contributors to the Collective Knowledge Technology v3 (CM automation language and CK playground) in alphabetical order
 
-* Sam Ainsworth (University of Cambridge, UK)
-* Saheli Bhattacharjee (@sahelib25)
+* Resmi Arjun
 * Ethan Cheng (Nvidia)
 * Jiahao Chen (MIT)
-* Gianfranco Costamagna
-* Chris Cummins (Facebook)
-* Valentin Dalibard
-* Alastair Donaldson
-* Thibaut Dumontet
-* Himanshu Dutta
-* Daniil Efremov (Xored)
-* Leonid Fursin
-* Todd Gamblin (LLNL)
-* Chandan Reddy Gopal (ENS Paris)
-* Leo Gordon (dividiti)
-* Dave Greasley (University of Bristol)
-* Herve Guillou
-* Vincent Grevendonk (Arm)
+* Ray DeMoss (One Stop Systems)
+* Himanshu Dutta (Indian Institute of Technology)
+* Justin Faust (One Stop Systems)
+* Leonid Fursin (United Silicon Carbide)
 * Michael Goin (Neural Magic)
-* Christophe Guillon (STMicroelectronics)
-* Sven van Haastregt (Arm)
-* Michael Haidl
-* Stephen Herbein (LLNL)
-* Mehrdad Hessar (OctoML): support for TinyML automation
-* Patrick Hesse (College of Saint Benedict and Saint John's University)
-* Nikolay Istomin (Xored)
-* Kenan Kalajdzic
+* Jose Armando Hernandez (Paris Saclay University)
+* Mehrdad Hessar (OctoML)
+* Miro Hodak (AMD)
+* Nino Jacob
 * David Kanter (MLCommons)
-* Yuriy Kashnikov 
 * Jason Knight (OctoML)
-* Alexey Kravets (Arm)
-* Michael Kruse
-* Andrei Lascu
-* Anton Lokhmotov 
-* Peter Mattson (Google)
-* Graham Markall
-* Michael Mcgeagh (Arm)
-* Abdul Wahid Memon
+* Ilya Kozulin (Deelvin)
+* @makaveli10 (Collabora)
+* Peter Mattson (Google, MLCommons)
+* Kasper Mecklenburg (Arm)
 * Thierry Moreau (OctoML)
 * Sachin Mudaliyar
 * Stanley Mwangi (Microsoft)
-* Luigi Nardi 
-* Cedric Nugteren
-* Lucas Nussbaum (Universite de Lorraine)
-* Ivan Ospiov (Xored)
-* Lakshman Patel @Patel230
-* Egor Pasko (Google)
-* Ed Plowman (Arm)
-* Lahiru Rasnayake (NTNU)
+* Ashwin Nanjappa (Nvidia)
+* Nandeeka Nayak (UIUC)
+* Datta Nimmaturi (Nutanix)
+* Lakshman Patel
 * Vijay Janapa Reddi (Harvard University)
-* Alex Redshaw (Arm)
-* Vincent Rehm
-* Toomas Remmelg (University of Edinburgh)
-* Andrew Reusch (OctoML): support for TinyML automation
-* Jarrett Revels (MIT)
-* Jared Roesch (OctoML)
-* Dmitry Savenko (Xored)
-* Thomas Schmid (OctoML)
-* Aditya Kumar Shaw
-* Gavin Simpson (Arm)
-* Aaron Smith (Microsoft)
+* Andrew Reusch (OctoML)
+* Anandhu S (Kerala Technical University)
+* Warren Schultz (Principled Technologies)
+* Amrutha Sheleenderan (Kerala Technical University)
+* Byoungjun Seo (TTA)
+* Aditya Kumar Shaw (Indian Institute of Science)
 * Ilya Slavutin (Deelvin)
-* Michel Steuwer (University of Edinburgh)
-* Chloe Tessier (Illustrations for CK/CM presentations)
-* Flavio Vella (Free University of Bozen-Bolzano)
+* David Taufur (MLCommons)
+* Chloe Tessier
 * Gaurav Verma (Stony Brook University)
-* Emanuele Vitali
-* Dave Wilkinson (University of Pittsburgh)
-* Sergey Yakushkin (Synopsys)
-* Eiko Yoneki
-* Thomas Zhu (Oxford University)
-* @filven
-* @ValouBambou
-
-See more acknowledgments at the end of this [article](https://arxiv.org/abs/2011.01149).
+* Haoyang Zhang (UIUC)
+* Bojian Zheng (University of Toronto)
+* Thomas Zhu (Oxford University)
diff --git a/README.md b/README.md
@@ -5,72 +5,128 @@
 [![CM test](https://github.com/mlcommons/ck/actions/workflows/test-cm.yml/badge.svg)](https://github.com/mlcommons/ck/actions/workflows/test-cm.yml)
 [![CM script automation features test](https://github.com/mlcommons/ck/actions/workflows/test-cm-script-features.yml/badge.svg)](https://github.com/mlcommons/ck/actions/workflows/test-cm-script-features.yml)
 
-### Documentation and the Getting Started Guide
-
-[Table of contents](docs/README.md)
 
 ### About
 
-The "Collective Knowledge" project (CK) is motivated by the [feedback from researchers and practitioners](https://learning.acm.org/techtalks/reproducibility)
-while reproducing results from more than 150 research papers and validating them in the real world - 
-there is a need for a common and technology-agnostic framework
-that can facilitate reproducible research and simplify technology transfer to production
-across diverse and rapidly evolving software, hardware, models, and data.
-It consists of the following sub-projects:
+The [MLCommons Task Force on Automation and Reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md), 
+[cTuning foundation](https://cTuning.org) and [cKnowledge.org](https://cKnowledge.org) 
+are developing Collective Knowledge v3 - an open-source technology 
+enabling collaborative, reproducible, automated and unified benchmarking, optimization and comparison of AI, ML and other emerging workloads
+across diverse and rapidly evolving models, data sets, software and hardware from different vendors.
+
+Collective Knowledge v3 includes:
+* [Non-intrusive, technology-agnostic and plugin-based Collective Mind automation language](cm)
+* [Collective Knowledge Playground](https://access.cKnowledge.org)
+* [Modular Inference Library](https://cknowledge.org/mil)
+
+[The community](https://access.cknowledge.org/playground/?action=challenges) successfully validated the CM automation language and CK playground by automating more than 90% of all [MLPerf inference v3.1 results](https://mlcommons.org/en/news/mlperf-inference-storage-q323/)
+and crossing 10,000 submissions in one round for the first time (submitted via [cTuning foundation](https://cTuning.org))!
+Here is the [list of the new CM/CK capabilities](docs/news-mlperf-v3.1.md) available to everyone 
+to prepare and automate their future MLPerf submissions.
+
+See related [HPC Wire'23 article](https://www.hpcwire.com/2023/09/13/mlperf-releases-latest-inference-results-and-new-storage-benchmark),
+[ACM REP'23 keynote](https://doi.org/10.5281/zenodo.8105339), 
+[ACM Tech Talk](https://learning.acm.org/techtalks/reproducibility) 
+and [MLPerf submitters orientation](https://doi.org/10.5281/zenodo.8144274) 
+to learn more about our open-source developments and long-term vision.
+
+Join our [public Discord server](https://discord.gg/JjWNWXKxwT) to learn how to run and extend MLPerf benchmarks, participate in future MLPerf submissions, 
+automate reproducibility initiatives at ACM/IEEE/NeurIPS conferences and co-design efficient AI Systems.
+
+### Documentation
+
+* [Table of contents](docs/README.md)
+
+### Upcoming events
+
+* [CM automation language makes it easier to reproduce experiments from the accepted ACM/IEEE MICRO'23 papers](https://github.com/ctuning/cm-reproduce-research-projects/tree/main/script)
+* [CK/CM authors will give a tutorial about CM automation language and CK playground at IISWC'23](https://iiswc.org/iiswc2023/#/program/)
+* [CM automation language and CK playground will help students run MLPerf inference benchmark at the Student Cluster Competition at SuperComputing'23](https://sc23.supercomputing.org/students/student-cluster-competition)
+
+*More events to come soon!*
+
+
+### Some practical use cases
 
-* [Collective Mind scripting language (MLCommons CM)](cm) 
-  is intended to help researchers and practitioners
-  describe all the steps required to reproduce their experiments across any software, hardware, and data
-  in a common and technology-agnostic way.
-  It is powered by Python, JSON and/or YAML meta descriptions, and a unified CLI.
-  CM can automatically generate unified README and synthesize unified containers with a common API
-  while reducing all the tedious, manual, repetitive, and ad-hoc efforts to validate research projects in production.
-  It is used in the same way in native environments, Python virtual environments, and containers.
+* [CM installation](https://github.com/mlcommons/ck/blob/master/docs/installation.md)
+* [All CM tutorials](https://github.com/mlcommons/ck/blob/master/docs/tutorials)
 
-  See a few real-world examples of using the CM scripting language:
-  - [README to reproduce published IPOL'22 paper](cm-mlops/script/app-ipol-reproducibility-2022-439)
-  - [README to reproduce MLPerf RetinaNet inference benchmark at Student Cluster Competition'22](docs/tutorials/sc22-scc-mlperf.md)
-  - [Auto-generated READMEs to reproduce official MLPerf BERT inference benchmark v3.0 submission with a model from the Hugging Face Zoo](https://github.com/mlcommons/submissions_inference_3.0/tree/main/open/cTuning/code/huggingface-bert/README.md)
-  - [Auto-generated Docker containers to run and reproduce MLPerf inference benchmark](cm-mlops/script/app-mlperf-inference/dockerfiles/retinanet)
+#### Run Python Hello World app
 
-* [Collective Mind scripts (MLCommons CM scripts)](cm-mlops/script) 
-  provide a low-level implementation of the high-level and technology-agnostic CM language.
+```bash
+python3 -m pip install cmind
+# restart bash to add cm and cmr binaries to PATH
 
-* [Collective Knowledge platform (MLCommons CK playground)](platform) 
-  aggregates [reproducible experiments](https://access.cknowledge.org/playground/?action=experiments) 
-  in the CM format, connects academia and industry to 
-  [organize benchmarking, reproducibility, replicability and optimization challenges]( https://github.com/mlcommons/ck/tree/master/cm-mlops/challenge ),
-  and help developers and users select Pareto-optimal end-to-end applications and systems based on their requirements and constraints
-  (cost, performance, power consumption, accuracy, etc).
+cm pull repo mlcommons@ck
+cm run script --tags=print,python,hello-world
+cmr "print python hello-world"
+```
 
+This CM script is a simple wrapper around native scripts and tools.
+It is described by a declarative YAML configuration file
+that specifies inputs, environment variables and dependencies on other portable
+and shared [CM scripts](https://github.com/mlcommons/ck/tree/master/cm-mlops/script):
 
-### Collaborative development
+```yaml
+alias: print-hello-world-py
+uid: d83274c7eb754d90
 
-This open-source technology is being developed by the public
-[MLCommons task force on automation and reproducibility](docs/taskforce.md)
-led by [Grigori Fursin](https://cKnowledge.org/gfursin) and
-[Arjun Suresh](https://www.linkedin.com/in/arjunsuresh).
-The goal is to connect academia and industry to develop, benchmark, compare, synthesize, 
-and deploy Pareto-efficient AI and ML systems and applications 
-(optimal trade off between performance, accuracy, power consumption, and price)
-in a unified, automated and reproducible way while slashing all development and operational costs.
+automation_alias: script
+automation_uid: 5b4e0237da074764
 
-* Join our [public Discord server](https://discord.gg/JjWNWXKxwT).
-* Join our [public conf-calls](https://docs.google.com/document/d/1zMNK1m_LhWm6jimZK6YE05hu4VH9usdbKJ3nBy-ZPAw).
-* Check our [news](docs/news.md).
-* Check our [presentation](https://doi.org/10.5281/zenodo.7871070) and [Forbes article](https://www.forbes.com/sites/karlfreund/2023/04/05/nvidia-performance-trounces-all-competitors-who-have-the-guts-to-submit-to-mlperf-inference-30/?sh=3c38d2866676) about our development plans.
-* Read about our [CK concept (previous version before MLCommons)](https://arxiv.org/abs/2011.01149).
+deps:
+- tags: detect,os
+- tags: get,sys-utils-cm
+- names:
+  - python
+  tags: get,python3
 
-### Copyright
+tags:
+- print
+- hello-world
+- python
 
-2021-2023 [MLCommons](https://mlcommons.org)
+```
 
-### License
+Our goal is to let the community start using CM within minutes!
 
-[Apache 2.0](LICENSE.md)
+#### Run MLPerf benchmarks out-of-the-box
+
+* [CM automation for the new MLPerf submitters](https://doi.org/10.5281/zenodo.8144274)
+* [MLPerf inference automation](https://github.com/mlcommons/ck/blob/master/docs/mlperf/inference)
+* [Visualization of MLPerf results](https://access.cknowledge.org/playground/?action=experiments)
+
+#### Participate in reproducible AI/ML Systems optimization challenges
+
+We invite the community to participate in collaborative benchmarking and optimization of AI/ML systems:
+* [Community challenges (reproducibility, extension, benchmarking, optimization)](https://access.cknowledge.org/playground/?action=challenges)
+* [Shared benchmarking results for AI/ML Systems (performance, accuracy, power consumption, costs)](https://access.cknowledge.org/playground/?action=experiments) 
+* [Leaderboard](https://access.cknowledge.org/playground/?action=contributors)
+
+#### Reproduce results from ACM/IEEE/NeurIPS papers
+
+* [CM automation to reproduce results from ACM/IEEE MICRO'23 papers](https://github.com/ctuning/cm-reproduce-research-projects)
+* [CM automation to support Student Cluster Competition at SuperComputing'23](https://github.com/mlcommons/ck/blob/master/docs/tutorials/sc22-scc-mlperf.md)
+* [CM automation to reproduce IPOL paper](https://github.com/mlcommons/ck/blob/master/cm-mlops/script/reproduce-ipol-paper-2022-439/README-extra.md)
+
+
+
+### Project coordinators
+
+* [Grigori Fursin](https://cKnowledge.org/gfursin)
+* [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh)
 
 ### Acknowledgments
 
-This project is currently supported by [MLCommons](https://mlcommons.org), [cTuning foundation](https://www.linkedin.com/company/ctuning-foundation),
-[cKnowledge](https://www.linkedin.com/company/cknowledge) and [individual contributors](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md).
+Collective Knowledge Technology v3 (including Collective Mind automation language and Collective Knowledge playground)
+was developed from scratch by [Grigori Fursin](https://cKnowledge.org/gfursin) 
+and [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) in 2022-2023
+within the [MLCommons Task Force on Automation and Reproducibility](docs/taskforce.md)
+and with many great contributions from [the community](CONTRIBUTING.md).
+
+This project is supported by [MLCommons](https://mlcommons.org), 
+[cTuning foundation](https://cTuning.org),
+[cKnowledge.org](https://cKnowledge.org),
+and [individual contributors](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md).
 We thank [HiPEAC](https://hipeac.net) and [OctoML](https://octoml.ai) for sponsoring initial development.
+
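The hello-world example added to README.md invokes the same script in two spellings, `cm run script --tags=print,python,hello-world` and `cmr "print python hello-world"`, while the YAML meta lists the tags `print`, `hello-world` and `python`. A minimal sketch of how such tag-based selection could work, under the assumption that a script is selected when every requested tag appears in its meta. This is an illustration only, not the cmind implementation, and the helper names `parse_tags` and `matches` are hypothetical:

```python
# Hypothetical illustration of tag-based script selection; not the cmind code.

# Subset of the meta shown in the README diff above (print-hello-world-py).
HELLO_WORLD_META = {
    "alias": "print-hello-world-py",
    "uid": "d83274c7eb754d90",
    "tags": ["print", "hello-world", "python"],
}


def parse_tags(spec):
    """Split a tag spec on commas or whitespace so both CLI spellings parse alike."""
    return {tag for tag in spec.replace(",", " ").split() if tag}


def matches(meta, spec):
    """Assumed rule: a script matches when all requested tags are in its meta."""
    return parse_tags(spec) <= set(meta.get("tags", []))
```

Under this assumption the order of tags does not matter, which is consistent with the README listing the tags in a different order than the command line does.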