This repository has been archived by the owner on Oct 11, 2024. It is now read-only.
forked from vllm-project/vllm
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Benchmarking : Prepare for GHA benchmark UI (#122)
SUMMARY:
- Miscellaneous updates to the benchmarking infrastructure to support the GitHub benchmarking UI
- Clean up configs
- Remove default arguments
- Add a `description` field to the benchmarking scripts so we can communicate intent to the UI
- Move benchmark_result.py to the logging folder
- Add a gha_benchmark_logging script that consumes BenchmarkResult JSON and outputs a JSON that the GitHub Benchmark UI can understand
- Add a minimal_test.json config that can be used for infra testing
- Explicitly name the config list as nightly and add the remote-push job to nightly for fair comparison

TEST PLAN:
Manual testing
- nm-benchmark manual trigger: https://github.com/neuralmagic/nm-vllm/actions/runs/8284500798
- nightly manual trigger: https://github.com/neuralmagic/nm-vllm/actions/runs/8285535882

---------

Co-authored-by: Varun Sundar Rabindranath <[email protected]>
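The gha_benchmark_logging script is not visible in this part of the diff. As a rough sketch of the kind of conversion the summary describes, the snippet below assumes a hypothetical BenchmarkResult layout (a `description` string plus a `metrics` mapping from metric name to `value` and `unit`) and flattens it into the list-of-records shape (`name`, `unit`, `value`, optional `extra`) accepted by tools such as github-action-benchmark in its custom-tool mode. The input field names, the `to_gha_records` helper, and the sample numbers are all assumptions for illustration, not the repository's schema or measured results.

```python
import json
from typing import Any


def to_gha_records(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten one (hypothetical) benchmark result into name/unit/value records."""
    description = result.get("description", "")
    records = []
    # Sort by metric name so the output order is stable between runs.
    for metric_name, metric in sorted(result["metrics"].items()):
        records.append({
            "name": metric_name,
            "unit": metric["unit"],
            "value": metric["value"],
            # Carry the config's description along so the UI can show intent.
            "extra": description,
        })
    return records


# Placeholder input: field names and numbers are made up for this sketch.
example_result = {
    "description": "Benchmark vllm serving",
    "metrics": {
        "request_throughput": {"value": 4.2, "unit": "requests/s"},
        "median_ttft": {"value": 35.0, "unit": "ms"},
    },
}

gha_records = to_gha_records(example_result)
print(json.dumps(gha_records, indent=2))
```

Passing the `description` through as `extra` is one way the new field could reach the UI; the actual script may attach it differently.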
1 parent ac9c9c8, commit feb86cd
Showing 16 changed files with 318 additions and 64 deletions.
    @@ -0,0 +1 @@
    +neuralmagic/benchmarks/configs/minimal_test.json
.github/data/nm_benchmark_configs_list.txt → ...ata/nm_benchmark_nightly_configs_list.txt (1 change: 1 addition & 0 deletions)
    @@ -1,2 +1,3 @@
     neuralmagic/benchmarks/configs/benchmark_serving.json
     neuralmagic/benchmarks/configs/benchmark_throughput.json
    +neuralmagic/benchmarks/configs/benchmark_remote_push.json
File renamed without changes.
    @@ -0,0 +1,43 @@
    +{
    +  "configs": [
    +    {
    +      "description": "Benchmark vllm serving",
    +      "models": [
    +        "mistralai/Mistral-7B-Instruct-v0.2"
    +      ],
    +      "use_all_available_gpus" : "",
    +      "max_model_lens": [
    +        4096
    +      ],
    +      "sparsity": [],
    +      "script_name": "benchmark_serving",
    +      "script_args": {
    +        "nr-qps-pair_" : ["5,inf"],
    +        "dataset": [
    +          "sharegpt"
    +        ]
    +      }
    +    },
    +    {
    +      "description": "Benchmark vllm engine throughput - with dataset",
    +      "models": [
    +        "mistralai/Mistral-7B-Instruct-v0.2"
    +      ],
    +      "max_model_lens" : [4096],
    +      "script_name": "benchmark_throughput",
    +      "script_args": {
    +        "output-len": [
    +          128
    +        ],
    +        "num-prompts": [
    +          100
    +        ],
    +        "dataset" : [
    +          "sharegpt"
    +        ],
    +        "max-model-len" : [4096],
    +        "use-all-available-gpus_" : []
    +      }
    +    }
    +  ]
    +}
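The config above lists each script argument as a JSON array, which suggests a sweep over values. The sketch below shows one way such a file could be expanded into individual runs. It rests on two assumptions that the diff does not confirm: that the runner takes the cartesian product of the listed values, and that an empty list marks a value-less flag (represented here as `None`). The helpers `expand_script_args` and `describe_runs` are hypothetical names, and argument keys (including the trailing underscores) are passed through untouched because their meaning is not shown here.

```python
import itertools
import json

# One entry copied from the config in the diff above.
CONFIG_TEXT = """
{
  "configs": [
    {
      "description": "Benchmark vllm engine throughput - with dataset",
      "models": ["mistralai/Mistral-7B-Instruct-v0.2"],
      "max_model_lens": [4096],
      "script_name": "benchmark_throughput",
      "script_args": {
        "output-len": [128],
        "num-prompts": [100],
        "dataset": ["sharegpt"],
        "max-model-len": [4096],
        "use-all-available-gpus_": []
      }
    }
  ]
}
"""


def expand_script_args(script_args):
    """Yield one {arg: value} dict per combination of the listed values.

    Assumption: an empty list marks a value-less flag, shown here as None.
    """
    keys = list(script_args)
    value_lists = [script_args[key] or [None] for key in keys]
    for combo in itertools.product(*value_lists):
        yield dict(zip(keys, combo))


def describe_runs(config_text):
    """Return (script_name, model, args) for every run implied by the config."""
    runs = []
    for config in json.loads(config_text)["configs"]:
        for model in config["models"]:
            for args in expand_script_args(config["script_args"]):
                runs.append((config["script_name"], model, args))
    return runs


runs = describe_runs(CONFIG_TEXT)
for script_name, model, args in runs:
    print(script_name, model, args)
```

With single-element lists, as in this config, the expansion yields exactly one run per model, which fits the stated purpose of a small config for infrastructure testing.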