remove low-workload benchmarks that are flaky (#156)

Summary:
This PR removes the following benchmarking runs:
- Sparse 2:4: prompts 300, qps 1
- Sparse 50%: prompts 150, qps 0.5

Nightlies that reported flakiness / regression:
- https://github.com/neuralmagic/nm-vllm/actions/runs/8444603631/job/23138489705
- https://github.com/neuralmagic/nm-vllm/actions/runs/8475302354/job/23229958270
- https://github.com/neuralmagic/nm-vllm/actions/runs/8486867771/job/23258658412

On the nightlies, we trigger a perf. regression alert when the value change is more than 10%. For the cases being removed, the run-to-run variance is in the range of [10% - 20%], which by itself exceeds the alert threshold.

Test: None

Co-authored-by: Varun Sundar Rabindranath <[email protected]>
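The alerting rule described above can be illustrated with a minimal sketch. This is not the nightly workflow's actual code: the function names, the sample values, and the use of a max-minus-min spread as the variance measure are all assumptions made for illustration. Only the 10% threshold comes from the commit message.

```python
# Hypothetical sketch of a 10% regression alert; not the repository's implementation.
from typing import Sequence

ALERT_THRESHOLD = 0.10  # from the commit message: alert when the change exceeds 10%


def relative_change(baseline: float, current: float) -> float:
    """Absolute change of `current` relative to `baseline`, as a fraction."""
    return abs(current - baseline) / abs(baseline)


def triggers_alert(baseline: float, current: float,
                   threshold: float = ALERT_THRESHOLD) -> bool:
    """True when the change between two runs is larger than the threshold."""
    return relative_change(baseline, current) > threshold


def is_too_noisy(samples: Sequence[float],
                 threshold: float = ALERT_THRESHOLD) -> bool:
    """True when the run-to-run spread (max - min, relative to the mean)
    already exceeds the alert threshold, so alerts would fire on noise alone."""
    mean = sum(samples) / len(samples)
    spread = (max(samples) - min(samples)) / mean
    return spread > threshold


if __name__ == "__main__":
    # Made-up metric values for a stable and a flaky workload.
    stable_runs = [100.0, 101.0, 99.0]
    flaky_runs = [100.0, 112.0, 95.0]
    print("stable workload too noisy:", is_too_noisy(stable_runs))
    print("flaky workload too noisy:", is_too_noisy(flaky_runs))
```

A workload whose spread is already above the threshold cannot distinguish a real regression from noise, which is the rationale given for dropping the two low-workload runs.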
neuralmagic · Apr 1, 2024 · cbe584e · github-actions · Apr 2, 2024
1 parent ff768a8
commit cbe584e
Showing 1 changed file with 0 additions and 2 deletions.
diff --git a/neuralmagic/benchmarks/configs/benchmark_serving.json b/neuralmagic/benchmarks/configs/benchmark_serving.json
@@ -39,7 +39,6 @@
 			"script_name": "benchmark_serving",
 			"script_args": {
 				"nr-qps-pair_": [
-                                        "150,0.5",
                                         "300,1",
                                         "750,2.5",
                                         "1500,5"
@@ -63,7 +62,6 @@
 			"script_args": {
 				"nr-qps-pair_": [
                                         "150,0.5",
-                                        "300,1",
                                         "750,2.5",
                                         "1500,5"
 				],
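For readers unfamiliar with the `nr-qps-pair_` entries, the sketch below shows one way to interpret them. It assumes each string has the form `"<num_prompts>,<qps>"`, which matches the commit summary ("prompts 150, qps 0.5") but is not confirmed by the benchmark script itself. The `Workload` type and `parse_nr_qps_pair` helper are hypothetical names introduced here.

```python
# Hypothetical helper for reading "nr-qps-pair_" entries; the format is an assumption.
from typing import List, NamedTuple


class Workload(NamedTuple):
    num_prompts: int  # total number of prompts sent in the run
    qps: float        # target request rate, in queries per second

    @property
    def nominal_duration_s(self) -> float:
        """Seconds needed to issue all prompts at the target rate."""
        return self.num_prompts / self.qps


def parse_nr_qps_pair(entry: str) -> Workload:
    """Parse one "<num_prompts>,<qps>" string into a Workload."""
    num_prompts, qps = entry.split(",")
    return Workload(int(num_prompts), float(qps))


def parse_all(entries: List[str]) -> List[Workload]:
    return [parse_nr_qps_pair(e) for e in entries]


if __name__ == "__main__":
    # The four pairs that appear in the diff, including the removed ones.
    for w in parse_all(["150,0.5", "300,1", "750,2.5", "1500,5"]):
        print(f"{w.num_prompts} prompts at {w.qps} qps "
              f"-> {w.nominal_duration_s:.0f} s of request arrivals")
```

Under this reading, every pair in the diff issues its prompts over the same nominal 300 seconds, so the removed entries differ from the retained ones only in having fewer requests per run.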