
DevX: Improve error reporting for benchmark jobs. #8125

Closed
guangy10 opened this issue Jan 31, 2025 · 2 comments · Fixed by #8482 or #8985
Assignees
yangw-dev
Labels
enhancement Not as big of a feature, but technically not a bug. Should be easy to fix
module: benchmark Issues related to the benchmark infrastructure
module: user experience Issues related to reducing friction for users
triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

guangy10 (Contributor) commented Jan 31, 2025

🐛 Describe the bug

[Screenshot of the example workflow run]

As shown in an example run, there are two issues that make it difficult to understand the expected behavior and locate the exact error:

  1. Unexpected job success status: Two configurations failed to export, yet the benchmark-on-device jobs were still marked as successful (highlighted in the red box). The expected behavior is that benchmark-on-device jobs should either not be scheduled or be marked as skipped/canceled when a dependent job, such as export, fails.
  2. In the export-models job, the error in the "Upload artifacts to S3" step is misleading. It should be marked as skipped or canceled, since the "Run script in the container" step failed to generate the .pte file (highlighted in the yellow box). See the sketch after this list for the intended gating.
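A minimal sketch of the intended gating, assuming a GitHub Actions workflow (job names, step names, and scripts here are hypothetical, not the actual ExecuTorch workflow):

```yaml
jobs:
  export-models:
    runs-on: ubuntu-latest
    steps:
      - name: Run script in the container
        id: export
        run: ./export_model.sh   # hypothetical script that should produce the .pte file

      - name: Upload artifacts to S3
        # Gate the upload on the export step so a missing .pte file shows up as
        # "skipped" here instead of a misleading upload error.
        if: ${{ steps.export.outcome == 'success' }}
        run: ./upload_to_s3.sh model.pte   # hypothetical upload script

  benchmark-on-device:
    # With `needs` plus the condition below, the device job is skipped
    # (not marked successful) when the export job fails.
    needs: export-models
    if: ${{ needs.export-models.result == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - name: Run benchmark on device
        run: ./run_device_benchmark.sh   # hypothetical benchmark script
```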

cc: @huydhn @cbilgin @digantdesai @kimishpatel

Versions

trunk

cc @huydhn @kirklandsign @shoumikhin @mergennachin @byjlw

@guangy10 guangy10 added enhancement Not as big of a feature, but technically not a bug. Should be easy to fix module: benchmark Issues related to the benchmark infrastructure labels Jan 31, 2025
@digantdesai digantdesai added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Feb 3, 2025
@guangy10 guangy10 moved this to To triage in ExecuTorch DevX Feb 4, 2025
@huydhn huydhn moved this to Cold Storage in PyTorch OSS Dev Infra Feb 4, 2025
@guangy10 guangy10 added the module: user experience Issues related to reducing friction for users label Feb 5, 2025
@huydhn huydhn moved this to Ready in ExecuTorch Benchmark Feb 11, 2025
@huydhn huydhn assigned yangw-dev and unassigned huydhn Feb 11, 2025
@yangw-dev yangw-dev moved this from Cold Storage to In Progress in PyTorch OSS Dev Infra Feb 13, 2025
@yangw-dev yangw-dev moved this from Ready to In Progress in ExecuTorch Benchmark Feb 13, 2025
guangy10 (Contributor, Author) commented:

[Screenshot of the benchmark results showing two "iPhone 15" entries]

@yangw-dev There is another tiny improvement we can make. As shown in the screenshot, there are two entries for "iPhone 15". They are actually different: one is running iOS 18 and the other iOS 17. We should consider adding the major.minor OS version to avoid confusion and improve readability. cc: @huydhn

huydhn (Contributor) commented Feb 19, 2025

Given that there is an issue where a timed-out job won't print anything, I'm tempted to split this part into two steps:

  1. Run the benchmark
  2. Print the spec outputs. The second step will still run even if the first step times out, which will also make the spec output easier to view (sketched below).

Thoughts?
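A minimal sketch of this split, again assuming a GitHub Actions workflow (step names, the timeout value, and the output file are hypothetical):

```yaml
steps:
  - name: Run the benchmark
    id: benchmark
    timeout-minutes: 60            # hypothetical per-step timeout
    run: ./run_benchmark.sh        # hypothetical benchmark script

  - name: Print the spec outputs
    # always() lets this step run even when the benchmark step fails or times out,
    # so the spec output is still visible in the job logs.
    if: ${{ always() }}
    run: cat benchmark-spec.json   # hypothetical spec output file
```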

@github-project-automation github-project-automation bot moved this from In Progress to Done in PyTorch OSS Dev Infra Mar 3, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in ExecuTorch Benchmark Mar 3, 2025
@github-project-automation github-project-automation bot moved this from To triage to Done in ExecuTorch DevX Mar 3, 2025
@yangw-dev yangw-dev reopened this Mar 3, 2025
@github-project-automation github-project-automation bot moved this from Done to Backlog in ExecuTorch DevX Mar 3, 2025
@guangy10 guangy10 moved this from Done to In Progress in ExecuTorch Benchmark Mar 4, 2025
@github-project-automation github-project-automation bot moved this from Backlog to Done in ExecuTorch DevX Mar 6, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in ExecuTorch Benchmark Mar 6, 2025
Projects
PyTorch OSS Dev Infra: Done · ExecuTorch Benchmark: Done · ExecuTorch DevX: Done
4 participants