
[DevX] Support filtering benchmark configs in on-demand workflow #7639

Merged 1 commit into main on Jan 22, 2025

Conversation

guangy10
Contributor

@guangy10 guangy10 commented Jan 14, 2025

#7349 introduced benchmark configs that automatically detect all supported delegates/quants/dtypes for a given model. That works well for the automated nightly benchmark jobs; for on-demand jobs, however, it runs every supported config for a model even when the user only wants a specific one. This PR improves the DevX by supporting filtering of benchmark configs in the on-demand workflow.

For example,

python .ci/scripts/gather_benchmark_configs.py  --models mv3,dl3 --devices samsung_galaxy_s22 --configs qnn_q8

will generate the matrix for qnn_q8 only, even though the models support other configs such as xnnpack_q8:

::set-output name=benchmark_configs::{"include": [{"model": "mv3", "config": "qnn_q8", "device_name": "samsung_galaxy_s22", "device_arn": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/e59f866a-30aa-4aa1-87b7-4510e5820dfa"}, {"model": "dl3", "config": "qnn_q8", "device_name": "samsung_galaxy_s22", "device_arn": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/e59f866a-30aa-4aa1-87b7-4510e5820dfa"}]}
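The filtering described above can be sketched roughly as follows. This is an illustrative sketch only; `filter_configs` and the placeholder config list are hypothetical, and the real logic lives in `.ci/scripts/gather_benchmark_configs.py`:

```python
# Hypothetical sketch of --configs filtering; not the actual script's API.
import argparse


def filter_configs(supported, requested):
    """Keep only the requested configs for a model; reject unknown names."""
    if not requested:
        return supported  # no --configs given: keep the nightly behavior
    unknown = sorted(set(requested) - set(supported))
    if unknown:
        raise ValueError(f"Unsupported config(s): {unknown}")
    return [c for c in supported if c in requested]


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--models", required=True)   # e.g. "mv3,dl3"
    parser.add_argument("--devices", required=True)  # e.g. "samsung_galaxy_s22"
    parser.add_argument("--configs", default="")     # e.g. "qnn_q8"
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    requested = [c for c in args.configs.split(",") if c]
    for model in args.models.split(","):
        supported = ["xnnpack_q8", "qnn_q8"]  # placeholder for per-model discovery
        print(model, filter_configs(supported, requested))
```

Raising on an unknown config name (rather than silently ignoring it) gives the on-demand user immediate feedback when a filter matches nothing.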

Trigger an on-demand benchmark via GitHub UI:
[Screenshot: on-demand benchmark triggered via the GitHub Actions workflow-dispatch UI, 2025-01-14]

Unit Tests:

test_device_pools_contains_all_devices (__main__.TestGatehrBenchmarkConfigs) ... ok
test_extract_all_configs_android (__main__.TestGatehrBenchmarkConfigs) ... ok
test_extract_all_configs_ios (__main__.TestGatehrBenchmarkConfigs) ... ok
test_gather_benchmark_configs_cli (__main__.TestGatehrBenchmarkConfigs) ... ok
test_gather_benchmark_configs_cli_specified_configs (__main__.TestGatehrBenchmarkConfigs) ... ok
test_gather_benchmark_configs_cli_specified_configs_raise (__main__.TestGatehrBenchmarkConfigs) ... ok
test_generate_compatible_configs_llama_model (__main__.TestGatehrBenchmarkConfigs) ... ok
test_generate_compatible_configs_non_genai_model (__main__.TestGatehrBenchmarkConfigs) ... ok
test_generate_compatible_configs_quantized_llama_model (__main__.TestGatehrBenchmarkConfigs) ... ok
test_generate_compatible_configs_unknown_model (__main__.TestGatehrBenchmarkConfigs) ... ok
test_is_valid_huggingface_model_id_valid (__main__.TestGatehrBenchmarkConfigs) ... ok
test_set_output_no_github_env (__main__.TestGatehrBenchmarkConfigs) ... ok

----------------------------------------------------------------------
Ran 12 tests in 0.177s

OK

Test jobs:


pytorch-bot bot commented Jan 14, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7639

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 26 Pending

As of commit b7390c2 with merge base 74aace6:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 14, 2025
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 14, 2025 01:53 — with GitHub Actions Failure
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 14, 2025 01:55 — with GitHub Actions Failure
@guangy10 guangy10 force-pushed the specified_config_ondemand branch from bac0256 to ed7f187 Compare January 14, 2025 01:56
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 14, 2025 01:56 — with GitHub Actions Failure
@guangy10 guangy10 temporarily deployed to upload-benchmark-results January 14, 2025 02:39 — with GitHub Actions Inactive
@guangy10 guangy10 temporarily deployed to upload-benchmark-results January 14, 2025 03:29 — with GitHub Actions Inactive
@guangy10 guangy10 force-pushed the specified_config_ondemand branch from ed7f187 to e7f89f5 Compare January 14, 2025 21:42
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 14, 2025 21:42 — with GitHub Actions Failure
@guangy10 guangy10 requested a review from huydhn January 14, 2025 21:42
@guangy10 guangy10 marked this pull request as ready for review January 14, 2025 21:43
@guangy10 guangy10 marked this pull request as draft January 14, 2025 21:43
@guangy10 guangy10 force-pushed the specified_config_ondemand branch from e7f89f5 to 4fdd159 Compare January 14, 2025 22:02
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 14, 2025 22:02 — with GitHub Actions Failure
@guangy10 guangy10 force-pushed the specified_config_ondemand branch from 4fdd159 to e7ab81d Compare January 14, 2025 22:15
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 14, 2025 22:15 — with GitHub Actions Failure
@guangy10 guangy10 force-pushed the specified_config_ondemand branch from e7ab81d to 003e0ff Compare January 14, 2025 22:16
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 14, 2025 22:53 — with GitHub Actions Failure
@guangy10 guangy10 force-pushed the specified_config_ondemand branch from 003e0ff to 8335365 Compare January 14, 2025 23:02
@guangy10 guangy10 marked this pull request as ready for review January 14, 2025 23:07
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 14, 2025 23:23 — with GitHub Actions Failure
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 14, 2025 23:45 — with GitHub Actions Failure
@guangy10
Contributor Author

guangy10 commented Jan 18, 2025

@huydhn Could you help review this PR? Do you have any idea about the permission issue?

.ci/scripts/tests/test_gather_benchmark_configs.py::TestGatehrBenchmarkConfigs::test_gather_benchmark_configs_cli_specified_configs - AssertionError: 1 != 0 : Error: Traceback (most recent call last):
  File "/pytorch/executorch/.ci/scripts/gather_benchmark_configs.py", line 282, in <module>
    get_benchmark_configs()
  File "/pytorch/executorch/.ci/scripts/gather_benchmark_configs.py", line 278, in get_benchmark_configs
    set_output("benchmark_configs", json.dumps(benchmark_configs))
  File "/pytorch/executorch/.ci/scripts/gather_benchmark_configs.py", line 183, in set_output
    with open(str(os.getenv("GITHUB_OUTPUT")), "a") as env:
PermissionError: [Errno 13] Permission denied: '/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_b3721eeb-2e15-4cc1-999b-bdcdda100ae6'

@guangy10 guangy10 force-pushed the specified_config_ondemand branch from b0b615e to f91efef Compare January 18, 2025 01:41
@guangy10
Contributor Author

Write to stdout in unit tests due to missing permission to access GITHUB_OUTPUT

@guangy10 guangy10 force-pushed the specified_config_ondemand branch from f91efef to 89a1daa Compare January 18, 2025 01:44
@facebook-github-bot
Contributor

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@huydhn
Contributor

huydhn commented Jan 18, 2025

Let me take a look at the PR.

For the unit test failures, let's try unsetting GITHUB_OUTPUT before running the test, for the sake of testing. The problem is that, unlike the workflow step where the gather_benchmark_configs script runs (https://github.com/pytorch/executorch/blob/main/.github/workflows/apple-perf.yml#L86), which has access to the GITHUB_OUTPUT variable under the same runner user, the unit test (https://github.com/pytorch/executorch/blob/main/.github/workflows/_unittest.yml#L24) runs inside a Docker container. So calling the script from the test runs under whatever user the container uses. You couldn't set GITHUB_OUTPUT inside a linux_job script anyway.
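The "unset GITHUB_OUTPUT before the test" idea can be sketched with `unittest.mock.patch.dict`, which snapshots and restores the environment around each test. The test class and method names here are illustrative, not from the PR:

```python
import os
import unittest
from unittest import mock


class TestEnvIsolation(unittest.TestCase):
    # patch.dict restores os.environ after the test, so popping
    # GITHUB_OUTPUT inside the test body cannot leak into other tests.
    @mock.patch.dict(os.environ, clear=False)
    def test_runs_without_github_output(self):
        os.environ.pop("GITHUB_OUTPUT", None)
        # Code under test would now take its no-GITHUB_OUTPUT path.
        self.assertNotIn("GITHUB_OUTPUT", os.environ)
```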

@guangy10
Contributor Author

> Let me take a look at the PR.
>
> For that unit test failures, let's try unset GITHUB_OUTPUT before running the test for the sake of testing. The problem is that unlike where gather_benchmark_configs script is run https://github.com/pytorch/executorch/blob/main/.github/workflows/apple-perf.yml#L86 which has access to GITHUB_OUTPUT variable under the same runner user, the unit test https://github.com/pytorch/executorch/blob/main/.github/workflows/_unittest.yml#L24 is run inside a Docker container. So, calling that inside the test would be run under whatever user the container is using. You couldn't set GITHUB_OUTPUT inside the linux_job script anyway.

Yeah, just updated the code to write to stdout instead of raising a permission exception. Rerunning the test now.

@guangy10 guangy10 temporarily deployed to upload-benchmark-results January 18, 2025 02:22 — with GitHub Actions Inactive
@guangy10
Contributor Author

The tests pass on Linux but fail on the macOS runner. It seems like the "::set-output" line is not printed on the macOS runner.

Expected to have:

Using provided configs ['coreml_fp16', 'xnnpack_q8'] for model 'mv2'
Discovered all supported configs for model 'dl3': ['xnnpack_q8', 'coreml_fp16', 'mps']
Using provided configs ['coreml_fp16', 'xnnpack_q8'] for model 'dl3'
::set-output name=benchmark_configs::{"include": [{"model": "mv2", "config": "coreml_fp16", "device_name": "apple_iphone_15", "device_arn": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/3b5acd2e-92e2-4778-b651-7726bafe129d"}, {"model": "mv2", "config": "xnnpack_q8", "device_name": "apple_iphone_15", "device_arn": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/3b5acd2e-92e2-4778-b651-7726bafe129d"}, {"model": "dl3", "config": "coreml_fp16", "device_name": "apple_iphone_15", "device_arn": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/3b5acd2e-92e2-4778-b651-7726bafe129d"}, {"model": "dl3", "config": "xnnpack_q8", "device_name": "apple_iphone_15", "device_arn": "arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/3b5acd2e-92e2-4778-b651-7726bafe129d"}]}

but only got

Using provided configs ['coreml_fp16', 'xnnpack_q8'] for model 'mv2'
Discovered all supported configs for model 'dl3': ['xnnpack_q8', 'coreml_fp16', 'mps']
Using provided configs ['coreml_fp16', 'xnnpack_q8'] for model 'dl3'

And I can't reproduce it on my Mac.

@guangy10 guangy10 temporarily deployed to upload-benchmark-results January 18, 2025 02:46 — with GitHub Actions Inactive
@guangy10 guangy10 temporarily deployed to upload-benchmark-results January 18, 2025 03:11 — with GitHub Actions Inactive
@guangy10 guangy10 temporarily deployed to upload-benchmark-results January 18, 2025 03:43 — with GitHub Actions Inactive
@huydhn
Contributor

huydhn commented Jan 20, 2025

> The tests pass on Linux but failed on macOS runner. It seems like "::set-ouput" is not set on macOS runner.

I think I get it now. The macOS job doesn't run inside a Docker container but on the host runner itself, so the script has access to the GITHUB_OUTPUT environment variable. It passes locally because there is no such variable there. I think the test needs to take this into account and check both stdout and the content of GITHUB_OUTPUT (if it exists).

@guangy10
Contributor Author

> The tests pass on Linux but failed on macOS runner. It seems like "::set-ouput" is not set on macOS runner.
>
> I think I get it now. MacOS job doesn't run inside a Docker container, but on the host runner itself. So, the script has access to the GITHUB_OUTPUT environment variable. It passes locally because there is no such variable. I think the test need to take this into account and check both stdout and the content of GITHUB_OUTPUT (if it exists)

I see. Or I can simply make it a Linux-only test, which seems better since in CI we run this step on Linux only.
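Gating a test to Linux can be done with `unittest.skipUnless`. A sketch under that assumption — the class name and placeholder body below are illustrative, not the PR's actual test code:

```python
import sys
import unittest


class TestLinuxOnlyGatherConfigs(unittest.TestCase):
    # CI only runs the gather step on Linux runners, so skip elsewhere
    # rather than modeling every runner's GITHUB_OUTPUT behavior.
    @unittest.skipUnless(
        sys.platform.startswith("linux"), "CI runs this step on Linux only"
    )
    def test_set_output_no_github_env(self):
        self.assertTrue(True)  # placeholder body for the sketch
```

On macOS the test is reported as skipped (with the given reason) instead of failing, which matches the "Linux-only" intent.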

@guangy10 guangy10 force-pushed the specified_config_ondemand branch from 89a1daa to 317dc11 Compare January 22, 2025 18:31
@facebook-github-bot
Contributor

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@guangy10 guangy10 temporarily deployed to upload-benchmark-results January 22, 2025 19:32 — with GitHub Actions Inactive
@guangy10 guangy10 force-pushed the specified_config_ondemand branch from 317dc11 to 77b2372 Compare January 22, 2025 21:14
@guangy10 guangy10 force-pushed the specified_config_ondemand branch from 77b2372 to b7390c2 Compare January 22, 2025 21:18
@facebook-github-bot
Contributor

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@guangy10 guangy10 temporarily deployed to upload-benchmark-results January 22, 2025 21:53 — with GitHub Actions Inactive
@guangy10 guangy10 temporarily deployed to upload-benchmark-results January 22, 2025 22:02 — with GitHub Actions Inactive
@guangy10 guangy10 merged commit f2720fa into main Jan 22, 2025
96 of 99 checks passed
@guangy10 guangy10 deleted the specified_config_ondemand branch January 22, 2025 22:02
@guangy10 guangy10 had a problem deploying to upload-benchmark-results January 22, 2025 23:46 — with GitHub Actions Failure
@guangy10
Contributor Author

@kimishpatel @digantdesai Just FYI, subscribed you to DevX-related work.

YIWENX14 pushed a commit that referenced this pull request Jan 28, 2025
Support config filtering in ondemand benchmark flow

Co-authored-by: Github Executorch <[email protected]>
zonglinpeng pushed a commit to zonglinpeng/executorch that referenced this pull request Jan 30, 2025
…orch#7639)

Support config filtering in ondemand benchmark flow

Co-authored-by: Github Executorch <[email protected]>
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. topic: not user facing