
Commit

Update docs for IndySCC24
arjunsuresh committed Oct 9, 2024
1 parent a11d1b7 commit 6918b85
Showing 2 changed files with 45 additions and 23 deletions.
62 changes: 42 additions & 20 deletions docs/benchmarks/language/reproducibility/indyscc24-bert.md
@@ -9,40 +9,62 @@ hide:

This guide is designed for the [IndySCC 2024](https://sc24.supercomputing.org/students/indyscc/) to walk participants through running and optimizing the [MLPerf Inference Benchmark](https://arxiv.org/abs/1911.02549) using [Bert Large](https://github.com/mlcommons/inference/tree/master/language/bert#supported-models) across various software and hardware configurations. The goal is to maximize system throughput (measured in samples per second) without compromising accuracy.

For a valid MLPerf inference submission, two types of runs are required: a performance run and an accuracy run. In this competition, we focus on the `Offline` scenario, where throughput is the key metric—higher values are better. The official MLPerf inference benchmark for Bert Large requires processing a minimum of 10833 samples in both performance and accuracy modes using the SQuAD v1.1 dataset. Setting up for Nvidia GPUs may take 2-3 hours but can be done offline. Your final output will be a tarball (`mlperf_submission.tar.gz`) containing MLPerf-compatible results, which you will submit to the SCC organizers for scoring.
For a valid MLPerf inference submission, two types of runs are required: a performance run and an accuracy run. In this competition, we focus on the `Offline` scenario, where throughput is the key metric—higher values are better. The official MLPerf inference benchmark for Bert Large requires processing a minimum of 10833 samples in both performance and accuracy modes using the SQuAD v1.1 dataset.
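
The run commands themselves are generated by the implementation sections further below, so treat the following as a rough sketch rather than the exact command: a short test run of the reference BERT implementation in the `Offline` scenario, assuming CPU with ONNX Runtime, looks roughly like this.

```bash
# Illustrative sketch only - the implementation tabs below produce the exact
# command (tags, framework, device) for your system and run mode.
cm run script --tags=run-mlperf,inference,_r4.1-dev \
   --model=bert-99 \
   --implementation=reference \
   --framework=onnxruntime \
   --category=edge \
   --scenario=Offline \
   --execution_mode=test \
   --device=cpu \
   --quiet
```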

## Scoring

In the SCC, your first objective will be to run a reference (unoptimized) Python implementation or a vendor-provided version (such as Nvidia's) of the MLPerf inference benchmark to secure a baseline score.

Once the initial run is successful, you'll have the opportunity to optimize the benchmark further by maximizing system utilization, applying quantization techniques, adjusting ML frameworks, experimenting with batch sizes, and more, all of which can earn you additional points.

Since vendor implementations of the MLPerf inference benchmark vary and are often limited to single-node benchmarking, teams will compete within their respective hardware categories (e.g., Nvidia GPUs, AMD GPUs). Points will be awarded based on the throughput achieved on your system.
In the IndySCC 2024, your objective will be to run a reference (unoptimized) Python implementation of the MLPerf inference benchmark and complete a successful submission that passes the submission checker. Only one of the available frameworks needs to be submitted.


!!! info
Both MLPerf and CM automation are evolving projects.
If you encounter issues or have questions, please submit them [here](https://github.com/mlcommons/cm4mlops/issues)

## Artifacts to submit to the SCC committee

You will need to submit the following files:

* `mlperf_submission_short.tar.gz` - automatically generated file with validated MLPerf results.
* `mlperf_submission_short_summary.json` - automatically generated summary of MLPerf results.
* `mlperf_submission_short.run` - the CM commands used to run the MLPerf BERT inference benchmark, saved to this file.
* `mlperf_submission_short.tstamps` - execution timestamps recorded before and after the CM commands, saved to this file.
* `mlperf_submission_short.md` - description of your platform and some highlights of the MLPerf benchmark execution.

All of the required files are pushed to the GitHub repository automatically once you complete the given commands; no additional files need to be submitted.
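
If you would like a quick look at the generated summary before it is uploaded, any JSON pretty-printer works; a minimal example, assuming the file listed above sits in your current working directory:

```bash
# Pretty-print the auto-generated MLPerf results summary listed above.
python3 -m json.tool mlperf_submission_short_summary.json
```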


=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

{{ mlperf_inference_implementation_readme (4, "bert-99", "reference", extra_variation_tags=",_short", scenarios=["Offline"],categories=["Edge"], setup_tips=False) }}
{{ mlperf_inference_implementation_readme (4, "bert-99", "reference", extra_variation_tags="", scenarios=["Offline"],categories=["Edge"], setup_tips=False) }}

=== "Nvidia"
## Nvidia MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "bert-99", "nvidia", extra_variation_tags=",_short", scenarios=["Offline"],categories=["Edge"], setup_tips=False, implementation_tips=False) }}


## Submission Commands

### Generate actual submission tree

```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--run_style=test \
--quiet \
--submitter=<Team Name>
```

* Use `--hw_name="My system name"` to give a meaningful system name.
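
Before pushing, you can sanity-check the generated archive; a minimal example, assuming the tarball named by `--env.CM_TAR_OUTFILE` above is written to your current working directory:

```bash
# List the first entries of the generated submission tree to confirm packaging.
tar -tzf submission.tar.gz | head -n 20
```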


### Push Results to GitHub

Fork the repository at [https://github.com/mlcommons/cm4mlperf-inference](https://github.com/mlcommons/cm4mlperf-inference) and work from its `mlperf-inference-results-scc24` branch.

Run the following command after **replacing `--repo_url` with your GitHub fork URL**.

```bash
cm run script --tags=push,github,mlperf,inference,submission \
--repo_url=https://github.com/<myfork>/cm4mlperf-inference \
--repo_branch=mlperf-inference-results-scc24 \
--commit_message="Results on system <HW Name>" \
--quiet
```

Once uploaded, open a Pull Request to the origin repository. A GitHub Action will run there, and once it has finished you can see your submitted results at [https://docs.mlcommons.org/cm4mlperf-inference](https://docs.mlcommons.org/cm4mlperf-inference).
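
If you prefer the command line over the web UI, the Pull Request can also be opened with the GitHub CLI. This is optional and assumes `gh` is installed and authenticated; `<mygithubuser>` is a placeholder for the owner of your fork.

```bash
# Optional: open the Pull Request from the terminal instead of the web UI.
gh pr create \
  --repo mlcommons/cm4mlperf-inference \
  --base mlperf-inference-results-scc24 \
  --head <mygithubuser>:mlperf-inference-results-scc24 \
  --title "Results on system <HW Name>" \
  --body "MLPerf inference results for IndySCC 2024"
```
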
6 changes: 3 additions & 3 deletions docs/benchmarks/text_to_image/reproducibility/scc24.md
@@ -80,17 +80,17 @@ cm run script --tags=generate,inference,submission \

### Push Results to GitHub

Fork the repository URL at [https://github.com/gateoverflow/cm4mlperf-inference](https://github.com/gateoverflow/cm4mlperf-inference).
Fork the repository at [https://github.com/mlcommons/cm4mlperf-inference](https://github.com/mlcommons/cm4mlperf-inference) and work from its `mlperf-inference-results-scc24` branch.

Run the following command after **replacing `--repo_url` with your GitHub fork URL**.

```bash
cm run script --tags=push,github,mlperf,inference,submission \
--repo_url=https://github.com/gateoverflow/cm4mlperf-inference \
--repo_url=https://github.com/<myfork>/cm4mlperf-inference \
--repo_branch=mlperf-inference-results-scc24 \
--commit_message="Results on system <HW Name>" \
--quiet
```

Once uploaded, open a Pull Request to the origin repository. A GitHub Action will run there, and once it has
finished you can see your submitted results at [https://gateoverflow.github.io/cm4mlperf-inference](https://gateoverflow.github.io/cm4mlperf-inference).
finished you can see your submitted results at [https://docs.mlcommons.org/cm4mlperf-inference](https://docs.mlcommons.org/cm4mlperf-inference).
