Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin committed Sep 21, 2023
2 parents 9bb8ab6 + 9614c5d commit f9d27d9
Showing 773 changed files with 17,367 additions and 4,068 deletions.
27 changes: 24 additions & 3 deletions .github/workflows/test-cm-tutorial-tvm-pip.yml
@@ -12,8 +12,29 @@ on:
       - '!cm-mlops/**.md'

 jobs:
-  build:
+  test_vm_runtime:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.9"]
+
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v3
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Install dependencies
+      run: |
+        python -m pip install cmind
+        cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}
+        cm run script --quiet --tags=get,sys-utils-cm
+    - name: Test CM Tutorial TVM pip install with VirtualMachine Runtime
+      run: |
+        python tests/tutorials/test_tutorial_tvm_pip_vm.py
+  test_ge_runtime:
     runs-on: ubuntu-latest
     strategy:
       fail-fast: false
@@ -31,6 +52,6 @@ jobs:
         python -m pip install cmind
         cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}
         cm run script --quiet --tags=get,sys-utils-cm
-    - name: Test CM Tutorial TVM pip install
+    - name: Test CM Tutorial TVM pip install with GraphExecutor Runtime
       run: |
-        python tests/tutorials/test_tutorial_tvm_pip.py
+        python tests/tutorials/test_tutorial_tvm_pip_ge.py
106 changes: 35 additions & 71 deletions CONTRIBUTING.md
@@ -16,91 +16,55 @@ Ensure that cla-bot and other checks pass for your Pull requests.

## Contributing to this project

We suggest you to join the [open MLCommons Collective Knowledge task force](docs/taskforce.md)
to learn how to use the CK technology, CM scripting language, enhance existing CM components for MLOps and DevOps,
run MLPerf benchmarks and contribute your own artifacts, scripts and workflows in the CM format.
Please join our [Discord server](https://discord.gg/JjWNWXKxwT)
to learn about how to use the CK technology v3 (including the MLCommons CM automation language, CK playground
and Modular Inference Library) or participate in collaborative developments.

Thank you for your support and looking forward to collaborating with you!

## Authors and maintainers
## Authors and project coordinators

* [Grigori Fursin](https://cKnowledge.org/gfursin)
* [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh)
* [Grigori Fursin](https://cKnowledge.org/gfursin) ([cTuning foundation](https://cTuning.org) and [cKnowledge.org](https://cKnowledge.org))
* [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) ([cTuning foundation](https://cTuning.org) and [cKnowledge.org](https://cKnowledge.org))

## Contributors in alphabetical order
## Contributors to the Collective Knowledge Technology v3 (CM automation language and CK playground) in alphabetical order

* Sam Ainsworth (University of Cambridge, UK)
* Saheli Bhattacharjee (@sahelib25)
* Resmi Arjun
* Ethan Cheng (Nvidia)
* Jiahao Chen (MIT)
* Gianfranco Costamagna
* Chris Cummins (Facebook)
* Valentin Dalibard <[email protected]>
* Alastair Donaldson <[email protected]>
* Thibaut Dumontet
* Himanshu Dutta
* Daniil Efremov (Xored)
* Leonid Fursin
* Todd Gamblin (LLNL)
* Chandan Reddy Gopal (ENS Paris)
* Leo Gordon (dividiti)
* Dave Greasley (University of Bristol)
* Herve Guillou
* Vincent Grevendonk (Arm)
* Ray DeMoss (One Stop Systems)
* Himanshu Dutta (Indian Institute of Technology)
* Justin Faust (One Stop Systems)
* Leonid Fursin (United Silicon Carbide)
* Michael Goin (Neural Magic)
* Christophe Guillon (STMicroelectronics)
* Sven van Haastregt (Arm)
* Michael Haidl
* Stephen Herbein (LLNL)
* Mehrdad Hessar (OctoML): support for TinyML automation
* Patrick Hesse (College of Saint Benedict and Saint John's University)
* Nikolay Istomin (Xored)
* Kenan Kalajdzic
* Jose Armando Hernandez (Paris Saclay University)
* Mehrdad Hessar (OctoML)
* Miro Hodak (AMD)
* Nino Jacob
* David Kanter (MLCommons)
* Yuriy Kashnikov
* Jason Knight (OctoML)
* Alexey Kravets (Arm)
* Michael Kruse <[email protected]>
* Andrei Lascu <[email protected]>
* Anton Lokhmotov
* Peter Mattson (Google)
* Graham Markall <[email protected]>
* Michael Mcgeagh (Arm)
* Abdul Wahid Memon <[email protected]>
* Ilya Kozulin (Deelvin)
* @makaveli10 (Collabora)
* Peter Mattson (Google, MLCommons)
* Kasper Mecklenburg (Arm)
* Thierry Moreau (OctoML)
* Sachin Mudaliyar
* Stanley Mwangi (Microsoft)
* Luigi Nardi
* Cedric Nugteren <[email protected]>
* Lucas Nussbaum (Universite de Lorraine)
* Ivan Ospiov (Xored)
* Lakshman Patel @Patel230
* Egor Pasko (Google)
* Ed Plowman (Arm)
* Lahiru Rasnayake (NTNU)
* Ashwin Nanjappa (Nvidia)
* Nandeeka Nayak (UIUC)
* Datta Nimmaturi (Nutanix)
* Lakshman Patel
* Vijay Janapa Reddi (Harvard University)
* Alex Redshaw (Arm)
* Vincent Rehm
* Toomas Remmelg (University of Edinburgh)
* Andrew Reusch (OctoML): support for TinyML automation
* Jarrett Revels (MIT)
* Jared Roesch (OctoML)
* Dmitry Savenko (Xored)
* Thomas Schmid (OctoML)
* Aditya Kumar Shaw
* Gavin Simpson (Arm)
* Aaron Smith (Microsoft)
* Andrew Reusch (OctoML)
* Anandhu S (Kerala Technical University)
* Warren Schultz (Principled Technologies)
* Amrutha Sheleenderan (Kerala Technical University)
* Byoungjun Seo (TTA)
* Aditya Kumar Shaw (Indian Institute of Science)
* Ilya Slavutin (Deelvin)
* Michel Steuwer (University of Edinburgh)
* Chloe Tessier (Illustrations for CK/CM presentations)
* Flavio Vella (Free University of Bozen-Bolzano)
* David Taufur (MLCommons)
* Chloe Tessier
* Gaurav Verma (Stony Brook University)
* Emanuele Vitali
* Dave Wilkinson (University of Pittsburgh)
* Sergey Yakushkin (Synopsys)
* Eiko Yoneki <[email protected]>
* Thomas Zhu (Oxford University) <[email protected]>
* @filven
* @ValouBambou

See more acknowledgments at the end of this [article](https://arxiv.org/abs/2011.01149).
* Haoyang Zhang (UIUC)
* Bojian Zheng (University of Toronto)
* Thomas Zhu (Oxford University)
156 changes: 106 additions & 50 deletions README.md
@@ -5,72 +5,128 @@
[![CM test](https://github.com/mlcommons/ck/actions/workflows/test-cm.yml/badge.svg)](https://github.com/mlcommons/ck/actions/workflows/test-cm.yml)
[![CM script automation features test](https://github.com/mlcommons/ck/actions/workflows/test-cm-script-features.yml/badge.svg)](https://github.com/mlcommons/ck/actions/workflows/test-cm-script-features.yml)

### Documentation and the Getting Started Guide

[Table of contents](docs/README.md)

### About

The "Collective Knowledge" project (CK) is motivated by the [feedback from researchers and practitioners](https://learning.acm.org/techtalks/reproducibility)
while reproducing results from more than 150 research papers and validating them in the real world -
there is a need for a common and technology-agnostic framework
that can facilitate reproducible research and simplify technology transfer to production
across diverse and rapidly evolving software, hardware, models, and data.
It consists of the following sub-projects:
The [MLCommons Task Force on Automation and Reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md),
[cTuning foundation](https://cTuning.org) and [cKnowledge.org](https://cKnowledge.org)
are developing Collective Knowledge v3 - an open-source technology
enabling collaborative, reproducible, automated and unified benchmarking, optimization and comparison of AI, ML and other emerging workloads
across diverse and rapidly evolving models, data sets, software and hardware from different vendors.

Collective Knowledge v3 includes:
* [Non-intrusive, technology-agnostic and plugin-based Collective Mind automation language](cm)
* [Collective Knowledge Playground](https://access.cKnowledge.org)
* [Modular Inference Library](https://cknowledge.org/mil)

[The community](https://access.cknowledge.org/playground/?action=challenges) successfully validated CM automation language and CK playground to automate > 90% of all [MLPerf inference v3.1 results](https://mlcommons.org/en/news/mlperf-inference-storage-q323/)
and cross 10000 submissions in one round for the first time (submitted via [cTuning foundation](https://cTuning.org))!
Here is the [list of the new CM/CK capabilities](docs/news-mlperf-v3.1.md) available to everyone
to prepare and automate their future MLPerf submissions.

See related [HPC Wire'23 article](https://www.hpcwire.com/2023/09/13/mlperf-releases-latest-inference-results-and-new-storage-benchmark),
[ACM REP'23 keynote](https://doi.org/10.5281/zenodo.8105339),
[ACM Tech Talk](https://learning.acm.org/techtalks/reproducibility)
and [MLPerf submitters orientation](https://doi.org/10.5281/zenodo.8144274)
to learn more about our open-source developments and long-term vision.

Join our [public Discord server](https://discord.gg/JjWNWXKxwT) to learn how to run and extend MLPerf benchmarks, participate in future MLPerf submissions,
automate reproducibility initiatives at ACM/IEEE/NeurIPS conferences and co-design efficient AI Systems.

### Documentation

* [Table of contents](docs/README.md)

### Upcoming events

* [CM automation language makes it easier to reproduce experiments from the accepted ACM/IEEE MICRO'23 papers](https://github.com/ctuning/cm-reproduce-research-projects/tree/main/script)
* [CK/CM authors will give a tutorial about CM automation language and CK playground at IISWC'23](https://iiswc.org/iiswc2023/#/program/)
* [CM automation language and CK playground will help students run MLPerf inference benchmark at the Student Cluster Competition at SuperComputing'23](https://sc23.supercomputing.org/students/student-cluster-competition)

*More events to come soon!*


### Some practical use cases

* [Collective Mind scripting language (MLCommons CM)](cm)
is intended to help researchers and practitioners
describe all the steps required to reproduce their experiments across any software, hardware, and data
in a common and technology-agnostic way.
It is powered by Python, JSON and/or YAML meta descriptions, and a unified CLI.
CM can automatically generate unified README and synthesize unified containers with a common API
while reducing all the tedious, manual, repetitive, and ad-hoc efforts to validate research projects in production.
It is used in the same way in native environments, Python virtual environments, and containers.
* [CM installation](https://github.com/mlcommons/ck/blob/master/docs/installation.md)
* [All CM tutorials](https://github.com/mlcommons/ck/blob/master/docs/tutorials)

See a few real-world examples of using the CM scripting language:
- [README to reproduce published IPOL'22 paper](cm-mlops/script/app-ipol-reproducibility-2022-439)
- [README to reproduce MLPerf RetinaNet inference benchmark at Student Cluster Competition'22](docs/tutorials/sc22-scc-mlperf.md)
- [Auto-generated READMEs to reproduce official MLPerf BERT inference benchmark v3.0 submission with a model from the Hugging Face Zoo](https://github.com/mlcommons/submissions_inference_3.0/tree/main/open/cTuning/code/huggingface-bert/README.md)
- [Auto-generated Docker containers to run and reproduce MLPerf inference benchmark](cm-mlops/script/app-mlperf-inference/dockerfiles/retinanet)
#### Run Python Hello World app

* [Collective Mind scripts (MLCommons CM scripts)](cm-mlops/script)
provide a low-level implementation of the high-level and technology-agnostic CM language.
```bash
python3 -m pip install cmind
# restart bash to add cm and cmr binaries to PATH

* [Collective Knowledge platform (MLCommons CK playground)](platform)
aggregates [reproducible experiments](https://access.cknowledge.org/playground/?action=experiments)
in the CM format, connects academia and industry to
[organize benchmarking, reproducibility, replicability and optimization challenges]( https://github.com/mlcommons/ck/tree/master/cm-mlops/challenge ),
and help developers and users select Pareto-optimal end-to-end applications and systems based on their requirements and constraints
(cost, performance, power consumption, accuracy, etc).
cm pull repo mlcommons@ck
cm run script --tags=print,python,hello-world
cmr "print python hello-world"
```
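The same run can probably be driven from Python, since `cmind` is installed as a regular Python package. The following is a minimal sketch, assuming that `cmind.access` takes a request dictionary whose keys mirror the command-line flags and returns a dictionary with a `return` code; the helper names `build_request` and `run_script` are hypothetical and exist only for this illustration.

```python
def build_request(tags):
    """Build a request mirroring `cm run script --tags=<tags>`.

    The key names are an assumption based on the CLI flags shown above.
    """
    return {
        "action": "run",
        "automation": "script",
        "tags": ",".join(tags),
        "out": "con",  # assumed: print script output to the console
    }


def run_script(tags):
    """Run a CM script selected by tags and raise if CM reports an error."""
    import cmind  # imported lazily so build_request works without cmind installed

    result = cmind.access(build_request(tags))
    if result["return"] > 0:
        raise RuntimeError(result.get("error", "CM call failed"))
    return result


# Usage (requires `pip install cmind` and `cm pull repo mlcommons@ck`):
# run_script(["print", "python", "hello-world"])
```

Keeping the request construction separate from the call makes it easy to inspect what would be sent to CM before anything is executed.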

This CM script is a simple wrapper to native scripts and tools
described by a simple declarative YAML configuration file
specifying inputs, environment variables and dependencies on other portable
and shared [CM scripts](https://github.com/mlcommons/ck/tree/master/cm-mlops/script):

### Collaborative development
```yaml
alias: print-hello-world-py
uid: d83274c7eb754d90

This open-source technology is being developed by the public
[MLCommons task force on automation and reproducibility](docs/taskforce.md)
led by [Grigori Fursin](https://cKnowledge.org/gfursin) and
[Arjun Suresh](https://www.linkedin.com/in/arjunsuresh).
The goal is to connect academia and industry to develop, benchmark, compare, synthesize,
and deploy Pareto-efficient AI and ML systems and applications
(optimal trade off between performance, accuracy, power consumption, and price)
in a unified, automated and reproducible way while slashing all development and operational costs.
automation_alias: script
automation_uid: 5b4e0237da074764

* Join our [public Discord server](https://discord.gg/JjWNWXKxwT).
* Join our [public conf-calls](https://docs.google.com/document/d/1zMNK1m_LhWm6jimZK6YE05hu4VH9usdbKJ3nBy-ZPAw).
* Check our [news](docs/news.md).
* Check our [presentation](https://doi.org/10.5281/zenodo.7871070) and [Forbes article](https://www.forbes.com/sites/karlfreund/2023/04/05/nvidia-performance-trounces-all-competitors-who-have-the-guts-to-submit-to-mlperf-inference-30/?sh=3c38d2866676) about our development plans.
* Read about our [CK concept (previous version before MLCommons)](https://arxiv.org/abs/2011.01149).
deps:
- tags: detect,os
- tags: get,sys-utils-cm
- names:
- python
tags: get,python3

### Copyright
tags:
- print
- hello-world
- python

2021-2023 [MLCommons](https://mlcommons.org)
```
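To make the role of `tags` and `deps` in the description above more concrete, here is a small, self-contained sketch of tag-based lookup with dependencies resolved first. It is not CM's actual implementation: the `ScriptMeta` class, the `find_script` and `resolve` helpers, the subset-matching rule, and the aliases and tags of the three dependency scripts are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptMeta:
    alias: str
    tags: set
    deps: list = field(default_factory=list)  # each entry is a comma-separated tag query


def find_script(registry, query):
    """Return the first script whose tags contain every tag in the query."""
    wanted = set(query.split(","))
    for meta in registry:
        if wanted <= meta.tags:
            return meta
    raise LookupError(f"no script matches tags: {query}")


def resolve(registry, query, order=None, visiting=None):
    """Return script aliases in execution order, dependencies first."""
    order = [] if order is None else order
    visiting = set() if visiting is None else visiting
    meta = find_script(registry, query)
    if meta.alias in order:
        return order  # already scheduled
    if meta.alias in visiting:
        raise ValueError(f"dependency cycle at {meta.alias}")
    visiting.add(meta.alias)
    for dep in meta.deps:
        resolve(registry, dep, order, visiting)
    order.append(meta.alias)
    return order


# Hypothetical registry; only the last entry is taken from the YAML above.
REGISTRY = [
    ScriptMeta("detect-os", {"detect", "os"}),
    ScriptMeta("get-sys-utils-cm", {"get", "sys-utils-cm"}),
    ScriptMeta("get-python3", {"get", "python3"}),
    ScriptMeta(
        "print-hello-world-py",
        {"print", "hello-world", "python"},
        ["detect,os", "get,sys-utils-cm", "get,python3"],
    ),
]

print(resolve(REGISTRY, "print,python,hello-world"))
# ['detect-os', 'get-sys-utils-cm', 'get-python3', 'print-hello-world-py']
```

The point of the sketch is that a script is addressed by what it does (its tags) rather than by a path, so a dependency such as `get,python3` can be satisfied by whichever shared script advertises those tags.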
### License
Our goal is to let the community start using CM within minutes!
[Apache 2.0](LICENSE.md)
#### Run MLPerf benchmarks out-of-the-box
* [CM automation for the new MLPerf submitters](https://doi.org/10.5281/zenodo.8144274)
* [MLPerf inference automation](https://github.com/mlcommons/ck/blob/master/docs/mlperf/inference)
* [Visualization of MLPerf results](https://access.cknowledge.org/playground/?action=experiments)
#### Participate in reproducible AI/ML Systems optimization challenges
We invite the community to participate in collaborative benchmarking and optimization of AI/ML systems:
* [Community challenges (reproducibility, extension, benchmarking, optimization)](https://access.cknowledge.org/playground/?action=challenges)
* [Shared benchmarking results for AI/ML Systems (performance, accuracy, power consumption, costs)](https://access.cknowledge.org/playground/?action=experiments)
* [Leaderboard](https://access.cknowledge.org/playground/?action=contributors)
#### Reproduce results from ACM/IEEE/NeurIPS papers
* [CM automation to reproduce results from ACM/IEEE MICRO'23 papers](https://github.com/ctuning/cm-reproduce-research-projects)
* [CM automation to support Student Cluster Competition at SuperComputing'23](https://github.com/mlcommons/ck/blob/master/docs/tutorials/sc22-scc-mlperf.md)
* [CM automation to reproduce IPOL paper](https://github.com/mlcommons/ck/blob/master/cm-mlops/script/reproduce-ipol-paper-2022-439/README-extra.md)
### Project coordinators
* [Grigori Fursin](https://cKnowledge.org/gfursin)
* [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh)
### Acknowledgments
This project is currently supported by [MLCommons](https://mlcommons.org), [cTuning foundation](https://www.linkedin.com/company/ctuning-foundation),
[cKnowledge](https://www.linkedin.com/company/cknowledge) and [individual contributors](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md).
Collective Knowledge Technology v3 (including Collective Mind automation language and Collective Knowledge playground)
was developed from scratch by [Grigori Fursin](https://cKnowledge.org/gfursin)
and [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) in 2022-2023
within the [MLCommons Task Force on Automation and Reproducibility](docs/taskforce.md)
and with many great contributions from [the community](CONTRIBUTING.md).
This project is supported by [MLCommons](https://mlcommons.org),
[cTuning foundation](https://cTuning.org),
[cKnowledge.org](https://cKnowledge.org),
and [individual contributors](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md).
We thank [HiPEAC](https://hipeac.net) and [OctoML](https://octoml.ai) for sponsoring initial development.