Commit 5e2539a

Adding a github.io documentation site with CM commands for all major implementations (#1701)

* Support batch-size in llama2 run

* Add Rclone-Cloudflare download instructions to README.md

* Add Rclone-Cloudflare download instructions to README.md

* Minor wording edit to README.md

* Add Rclone-Cloudflare download instructions to README.md

* Add Rclone-GDrive download instructions to README.md

* Add new and old instructions to README.md

* Tweak language in README.md

* Language tweak in README.md

* Minor language tweak in README.md

* Fix typo in README.md

* Count error when logging errors: submission_checker.py

* Fixes #1648, restrict loadgen uncommitted error message to within the loadgen directory

* Update test-rnnt.yml (#1688)

Stopping the github action for rnnt

* Added docs init

Added github action for website publish

Update benchmark documentation

Update publish.yaml

Update publish.yaml

Update benchmark documentation

Improved the submission documentation

Fix taskname

Removed unused images

* Fix benchmark URLs

* Fix links

* Add _full variation to run commands

* Added script flow diagram

* Added docker setup command for CM, extra run options

* Added support for docker options in the docs

* Added --quiet to the CM run_cmds in docs

---------

Co-authored-by: Nathan Wasson <[email protected]>
arjunsuresh and nathanw-mlc authored May 21, 2024
1 parent 87ba8cb commit 5e2539a
Showing 25 changed files with 944 additions and 0 deletions.
33 changes: 33 additions & 0 deletions .github/workflows/publish.yaml
@@ -0,0 +1,33 @@
# This is a basic workflow to help you get started with Actions

name: Publish site


on:
  release:
    types: [published]
  push:
    branches:
      - master
      - docs

jobs:
  publish:
    name: Publish the site
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository normally
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"

      - name: Install Mkdocs
        run: pip install -r docs/requirements.txt

      - name: Run Mkdocs deploy
        run: mkdocs gh-deploy --force
9 changes: 9 additions & 0 deletions docs/README.md
@@ -0,0 +1,9 @@
# Documentation Website for MLPerf Inference using the unified CM interface

## Commands to get the website running locally
```
git clone https://github.com/GATEOverflow/cm4mlperf-inference
cd cm4mlperf-inference
pip install -r requirements.txt
mkdocs serve
```
68 changes: 68 additions & 0 deletions docs/benchmarks/image_classification/resnet50.md
@@ -0,0 +1,68 @@
# Image Classification using ResNet50

## Dataset

The benchmark implementation run command automatically downloads the validation and calibration datasets and performs the necessary preprocessing. If you want to download only the datasets, you can use the commands below.

=== "Validation"

    The ResNet50 validation run uses the ImageNet 2012 validation dataset, consisting of 50,000 images.

    ### Get Validation Dataset
    ```
    cm run script --tags=get,dataset,imagenet,validation -j
    ```

=== "Calibration"

    The ResNet50 calibration dataset consists of 500 images selected from the ImageNet 2012 validation dataset. There are two alternative options for the calibration dataset.

    ### Get Calibration Dataset Using Option 1
    ```
    cm run script --tags=get,dataset,imagenet,calibration,_mlperf.option1 -j
    ```

    ### Get Calibration Dataset Using Option 2
    ```
    cm run script --tags=get,dataset,imagenet,calibration,_mlperf.option2 -j
    ```

## Model
The benchmark implementation run command automatically downloads the required model and performs the necessary conversions. If you want to download only the official model, you can use the commands below.

Get the Official MLPerf ResNet50 Model

=== "Tensorflow"

    ### Tensorflow
    ```
    cm run script --tags=get,ml-model,resnet50,_tensorflow -j
    ```

=== "Onnx"

    ### Onnx
    ```
    cm run script --tags=get,ml-model,resnet50,_onnx -j
    ```

## Benchmark Implementations
=== "MLCommons-Python"

    ### MLPerf Reference Implementation in Python

    {{ mlperf_inference_implementation_readme (4, "resnet50", "reference") }}

=== "Nvidia"

    ### Nvidia MLPerf Implementation

    {{ mlperf_inference_implementation_readme (4, "resnet50", "nvidia") }}

=== "Intel"

    ### Intel MLPerf Implementation

    {{ mlperf_inference_implementation_readme (4, "resnet50", "intel") }}

=== "Qualcomm"

    ### Qualcomm AI100 MLPerf Implementation

    {{ mlperf_inference_implementation_readme (4, "resnet50", "qualcomm") }}

=== "MLCommons-C++"

    ### MLPerf Modular Implementation in C++

    {{ mlperf_inference_implementation_readme (4, "resnet50", "cpp") }}
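The `{{ mlperf_inference_implementation_readme(...) }}` calls above are Jinja-style macros expanded at build time by the mkdocs-macros plugin. A minimal, hypothetical sketch of how such a macro could be registered is shown below; the function name matches the docs, but its body and the command it renders are illustrative assumptions only (the real implementation lives in the repository):

```python
# Hypothetical sketch of registering the macro used in these docs pages
# with the mkdocs-macros plugin. The rendered command is illustrative,
# not the repository's actual output.

def mlperf_inference_implementation_readme(spaces: int, model: str,
                                           implementation: str) -> str:
    """Return an indented markdown snippet for one benchmark implementation."""
    pad = " " * spaces  # content tabs need the snippet indented to nest correctly
    lines = [
        f"{pad}Run commands for {model} ({implementation} implementation):",
        f"{pad}```",
        f"{pad}cm run script --tags=run-mlperf,inference "
        f"--model={model} --implementation={implementation} --quiet",
        f"{pad}```",
    ]
    return "\n".join(lines)

def define_env(env):
    # mkdocs-macros calls define_env() at build time; registering the
    # function makes it usable inside {{ ... }} expressions in markdown.
    env.macro(mlperf_inference_implementation_readme)
```

The `spaces` argument (the `4` in the docs) controls indentation so the generated snippet nests inside the surrounding content tab.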
28 changes: 28 additions & 0 deletions docs/benchmarks/index.md
@@ -0,0 +1,28 @@
# MLPerf Inference Benchmarks

Please visit the individual benchmark links to see the run commands using the unified CM interface.

1. [Image Classification](image_classification/resnet50.md) using ResNet50 model and Imagenet-2012 dataset

2. [Text to Image](text_to_image/sdxl.md) using Stable Diffusion model and Coco2014 dataset

3. [Object Detection](object_detection/retinanet.md) using Retinanet model and OpenImages dataset

4. [Image Segmentation](medical_imaging/3d-unet.md) using 3d-unet model and KiTS19 dataset

5. [Question Answering](language/bert.md) using Bert-Large model and Squad v1.1 dataset

6. [Text Summarization](language/gpt-j.md) using GPT-J model and CNN Daily Mail dataset

7. [Text Summarization](language/llama2-70b.md) using LLAMA2-70b model and OpenORCA dataset

8. [Recommendation](recommendation/dlrm-v2.md) using DLRMv2 model and Criteo multihot dataset

All eight benchmarks can participate in the datacenter category.
All benchmarks except DLRMv2 and LLAMA2 can participate in the edge category.

`bert`, `llama2-70b`, `dlrm_v2`, and `3d-unet` have a high-accuracy (99.9%) variant, where the benchmark run must achieve an accuracy of at least `99.9%` of the FP32 reference model,
compared with the default `99%` accuracy requirement.

The `dlrm_v2` benchmark has a high-accuracy variant only. If this accuracy is not met, the submission result can be submitted only to the open division.
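To make the accuracy requirement concrete, the target score for each variant is simply the stated fraction of the FP32 reference model's score. A small illustration (the reference score below is a made-up example, not an official MLPerf value):

```python
# Illustration of how the 99% / 99.9% accuracy targets described above
# relate to the FP32 reference score. The reference value is a made-up
# example, not an official MLPerf number.

def accuracy_target(fp32_reference_score: float, fraction: float) -> float:
    """Minimum score a submission must reach for a given accuracy variant."""
    return fp32_reference_score * fraction

fp32_f1 = 90.0  # hypothetical FP32 reference F1 score (%)

default_target = accuracy_target(fp32_f1, 0.99)    # default (99%) variant
high_acc_target = accuracy_target(fp32_f1, 0.999)  # high-accuracy variant

print(round(default_target, 3))   # -> 89.1
print(round(high_acc_target, 3))  # -> 89.91
```

A submission that misses the high-accuracy target can still be valid for the default variant (except for `dlrm_v2`, which only has the high-accuracy variant).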

73 changes: 73 additions & 0 deletions docs/benchmarks/language/bert.md
@@ -0,0 +1,73 @@
# Question Answering using Bert-Large

## Dataset

The benchmark implementation run command automatically downloads the validation and calibration datasets and performs the necessary preprocessing. If you want to download only the datasets, you can use the commands below.

=== "Validation"

    The BERT validation run uses the SQuAD v1.1 dataset.

    ### Get Validation Dataset
    ```
    cm run script --tags=get,dataset,squad,validation -j
    ```

## Model
The benchmark implementation run command automatically downloads the required model and performs the necessary conversions. If you want to download only the official model, you can use the commands below.

Get the Official MLPerf Bert-Large Model

=== "Pytorch"

    ### Pytorch
    ```
    cm run script --tags=get,ml-model,bert-large,_pytorch -j
    ```

=== "Onnx"

    ### Onnx
    ```
    cm run script --tags=get,ml-model,bert-large,_onnx -j
    ```

=== "Tensorflow"

    ### Tensorflow
    ```
    cm run script --tags=get,ml-model,bert-large,_tensorflow -j
    ```

## Benchmark Implementations
=== "MLCommons-Python"

    ### MLPerf Reference Implementation in Python

    BERT-99
    {{ mlperf_inference_implementation_readme (4, "bert-99", "reference") }}

    BERT-99.9
    {{ mlperf_inference_implementation_readme (4, "bert-99.9", "reference") }}

=== "Nvidia"

    ### Nvidia MLPerf Implementation

    BERT-99
    {{ mlperf_inference_implementation_readme (4, "bert-99", "nvidia") }}

    BERT-99.9
    {{ mlperf_inference_implementation_readme (4, "bert-99.9", "nvidia") }}

=== "Intel"

    ### Intel MLPerf Implementation

    BERT-99
    {{ mlperf_inference_implementation_readme (4, "bert-99", "intel") }}

    BERT-99.9
    {{ mlperf_inference_implementation_readme (4, "bert-99.9", "intel") }}

=== "Qualcomm"

    ### Qualcomm AI100 MLPerf Implementation

    BERT-99
    {{ mlperf_inference_implementation_readme (4, "bert-99", "qualcomm") }}

    BERT-99.9
    {{ mlperf_inference_implementation_readme (4, "bert-99.9", "qualcomm") }}
57 changes: 57 additions & 0 deletions docs/benchmarks/language/gpt-j.md
@@ -0,0 +1,57 @@
# Text Summarization using GPT-J

## Dataset

The benchmark implementation run command automatically downloads the validation and calibration datasets and performs the necessary preprocessing. If you want to download only the datasets, you can use the commands below.

=== "Validation"

    The GPT-J validation run uses the CNNDM dataset.

    ### Get Validation Dataset
    ```
    cm run script --tags=get,dataset,cnndm,validation -j
    ```

## Model
The benchmark implementation run command automatically downloads the required model and performs the necessary conversions. If you want to download only the official model, you can use the commands below.

Get the Official MLPerf GPT-J Model

=== "Pytorch"

    ### Pytorch
    ```
    cm run script --tags=get,ml-model,gptj,_pytorch -j
    ```

## Benchmark Implementations
=== "MLCommons-Python"

    ### MLPerf Reference Implementation in Python

    GPTJ-99
    {{ mlperf_inference_implementation_readme (4, "gptj-99", "reference") }}

    GPTJ-99.9
    {{ mlperf_inference_implementation_readme (4, "gptj-99.9", "reference") }}

=== "Nvidia"

    ### Nvidia MLPerf Implementation

    GPTJ-99
    {{ mlperf_inference_implementation_readme (4, "gptj-99", "nvidia") }}

    GPTJ-99.9
    {{ mlperf_inference_implementation_readme (4, "gptj-99.9", "nvidia") }}

=== "Intel"

    ### Intel MLPerf Implementation

    GPTJ-99
    {{ mlperf_inference_implementation_readme (4, "gptj-99", "intel") }}

=== "Qualcomm"

    ### Qualcomm AI100 MLPerf Implementation

    GPTJ-99
    {{ mlperf_inference_implementation_readme (4, "gptj-99", "qualcomm") }}
52 changes: 52 additions & 0 deletions docs/benchmarks/language/llama2-70b.md
@@ -0,0 +1,52 @@
# Text Summarization using LLAMA2-70b

## Dataset

The benchmark implementation run command automatically downloads the validation and calibration datasets and performs the necessary preprocessing. If you want to download only the datasets, you can use the commands below.

=== "Validation"

    The LLAMA2-70b validation run uses the Open ORCA dataset.

    ### Get Validation Dataset
    ```
    cm run script --tags=get,dataset,openorca,validation -j
    ```

## Model
The benchmark implementation run command automatically downloads the required model and performs the necessary conversions. If you want to download only the official model, you can use the commands below.

Get the Official MLPerf LLAMA2-70b Model

=== "Pytorch"

    ### Pytorch
    ```
    cm run script --tags=get,ml-model,llama2-70b,_pytorch -j
    ```

## Benchmark Implementations
=== "MLCommons-Python"

    ### MLPerf Reference Implementation in Python

    LLAMA2-70b-99
    {{ mlperf_inference_implementation_readme (4, "llama2-70b-99", "reference") }}

    LLAMA2-70b-99.9
    {{ mlperf_inference_implementation_readme (4, "llama2-70b-99.9", "reference") }}

=== "Nvidia"

    ### Nvidia MLPerf Implementation

    LLAMA2-70b-99
    {{ mlperf_inference_implementation_readme (4, "llama2-70b-99", "nvidia") }}

    LLAMA2-70b-99.9
    {{ mlperf_inference_implementation_readme (4, "llama2-70b-99.9", "nvidia") }}

=== "Qualcomm"

    ### Qualcomm AI100 MLPerf Implementation

    LLAMA2-70b-99
    {{ mlperf_inference_implementation_readme (4, "llama2-70b-99", "qualcomm") }}
60 changes: 60 additions & 0 deletions docs/benchmarks/medical_imaging/3d-unet.md
@@ -0,0 +1,60 @@
# Medical Imaging using 3d-unet (KiTS 2019 kidney tumor segmentation task)

## Dataset

The benchmark implementation run command automatically downloads the validation and calibration datasets and performs the necessary preprocessing. If you want to download only the datasets, you can use the commands below.

=== "Validation"

    The 3d-unet validation run uses the KiTS19 dataset, performing the [KiTS 2019](https://kits19.grand-challenge.org/) kidney tumor segmentation task.

    ### Get Validation Dataset
    ```
    cm run script --tags=get,dataset,kits19,validation -j
    ```

## Model
The benchmark implementation run command automatically downloads the required model and performs the necessary conversions. If you want to download only the official model, you can use the commands below.

Get the Official MLPerf 3d-unet Model

=== "Pytorch"

    ### Pytorch
    ```
    cm run script --tags=get,ml-model,3d-unet,_pytorch -j
    ```

=== "Onnx"

    ### Onnx
    ```
    cm run script --tags=get,ml-model,3d-unet,_onnx -j
    ```

=== "Tensorflow"

    ### Tensorflow
    ```
    cm run script --tags=get,ml-model,3d-unet,_tensorflow -j
    ```

## Benchmark Implementations
=== "MLCommons-Python"

    ### MLPerf Reference Implementation in Python

    3d-unet-99.9
    {{ mlperf_inference_implementation_readme (4, "3d-unet-99.9", "reference") }}

=== "Nvidia"

    ### Nvidia MLPerf Implementation

    3d-unet-99
    {{ mlperf_inference_implementation_readme (4, "3d-unet-99", "nvidia") }}

    3d-unet-99.9
    {{ mlperf_inference_implementation_readme (4, "3d-unet-99.9", "nvidia") }}

=== "Intel"

    ### Intel MLPerf Implementation

    3d-unet-99
    {{ mlperf_inference_implementation_readme (4, "3d-unet-99", "intel") }}

    3d-unet-99.9
    {{ mlperf_inference_implementation_readme (4, "3d-unet-99.9", "intel") }}
