
DevX: Improve error reporting for benchmark jobs. #8125

Closed
guangy10 opened this issue Jan 31, 2025 · 2 comments · Fixed by #8482 or #8985
Assignees
yangw-dev
Labels
enhancement Not as big of a feature, but technically not a bug. Should be easy to fix
module: benchmark Issues related to the benchmark infrastructure
module: user experience Issues related to reducing friction for users
triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

guangy10 (Contributor) commented Jan 31, 2025

🐛 Describe the bug

[Screenshot of the example workflow run]

As shown in an example run, there are two issues that make it difficult to understand the expected behavior and locate the exact error:

  1. Unexpected job success status: Two configurations failed to export, yet the benchmark-on-device jobs were still marked as successful (highlighted in the red box). The expected behavior is that benchmark-on-device jobs should either not be scheduled or be marked as skipped/canceled when a dependent job, such as export, fails.
  2. In the export-models job, the error in the "Upload artifacts to S3" step is misleading. It should be marked as skipped or canceled, since the "Run script in the container" step failed to generate the .pte file (highlighted in the yellow box). See the sketch after this list for the intended gating.
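A minimal sketch of the intended gating, assuming a GitHub Actions workflow (job names, step names, and scripts here are hypothetical, not the actual ExecuTorch workflow):

```yaml
jobs:
  export-models:
    runs-on: ubuntu-latest
    steps:
      - name: Run script in the container
        id: export
        run: ./export_model.sh   # hypothetical script that should produce the .pte file

      - name: Upload artifacts to S3
        # Gate the upload on the export step so a missing .pte file shows up as
        # "skipped" here instead of a misleading upload error.
        if: ${{ steps.export.outcome == 'success' }}
        run: ./upload_to_s3.sh model.pte   # hypothetical upload script

  benchmark-on-device:
    # With `needs` plus the condition below, the device job is skipped
    # (not marked successful) when the export job fails.
    needs: export-models
    if: ${{ needs.export-models.result == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - name: Run benchmark on device
        run: ./run_device_benchmark.sh   # hypothetical benchmark script
```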

cc: @huydhn @cbilgin @digantdesai @kimishpatel

Versions

trunk

cc @huydhn @kirklandsign @shoumikhin @mergennachin @byjlw

@guangy10 guangy10 added enhancement Not as big of a feature, but technically not a bug. Should be easy to fix module: benchmark Issues related to the benchmark infrastructure labels Jan 31, 2025
@digantdesai digantdesai added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Feb 3, 2025
@guangy10 guangy10 moved this to To triage in ExecuTorch DevX Feb 4, 2025
@huydhn huydhn moved this to Cold Storage in PyTorch OSS Dev Infra Feb 4, 2025
@guangy10 guangy10 added the module: user experience Issues related to reducing friction for users label Feb 5, 2025
@huydhn huydhn moved this to Ready in ExecuTorch Benchmark Feb 11, 2025
@huydhn huydhn assigned yangw-dev and unassigned huydhn Feb 11, 2025
@yangw-dev yangw-dev moved this from Cold Storage to In Progress in PyTorch OSS Dev Infra Feb 13, 2025
@yangw-dev yangw-dev moved this from Ready to In Progress in ExecuTorch Benchmark Feb 13, 2025
guangy10 (Contributor, Author) commented:

[Screenshot of the benchmark results showing two "iPhone 15" entries]

@yangw-dev There is another tiny improvement we can make. As shown in the screenshot, there are two entries for "iPhone 15". They are actually different: one is running iOS 18 and the other iOS 17. We should consider adding the major.minor OS version to avoid confusion and improve readability. cc: @huydhn

huydhn (Contributor) commented Feb 19, 2025

Given that there is an issue where a timed-out job won't print anything, I'm tempted to split this part into two steps:

  1. Run the benchmark
  2. Print the spec outputs. The second step will still run even if the first step times out, which will also make the spec output easier to view (sketched below).

Thoughts?
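A minimal sketch of this split, again assuming a GitHub Actions workflow (step names, the timeout value, and the output file are hypothetical):

```yaml
steps:
  - name: Run the benchmark
    id: benchmark
    timeout-minutes: 60            # hypothetical per-step timeout
    run: ./run_benchmark.sh        # hypothetical benchmark script

  - name: Print the spec outputs
    # always() lets this step run even when the benchmark step fails or times out,
    # so the spec output is still visible in the job logs.
    if: ${{ always() }}
    run: cat benchmark-spec.json   # hypothetical spec output file
```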

@github-project-automation github-project-automation bot moved this from In Progress to Done in PyTorch OSS Dev Infra Mar 3, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in ExecuTorch Benchmark Mar 3, 2025
@github-project-automation github-project-automation bot moved this from To triage to Done in ExecuTorch DevX Mar 3, 2025
@yangw-dev yangw-dev reopened this Mar 3, 2025
@github-project-automation github-project-automation bot moved this from Done to Backlog in ExecuTorch DevX Mar 3, 2025
@guangy10 guangy10 moved this from Done to In Progress in ExecuTorch Benchmark Mar 4, 2025
@github-project-automation github-project-automation bot moved this from Backlog to Done in ExecuTorch DevX Mar 6, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in ExecuTorch Benchmark Mar 6, 2025
Projects
PyTorch OSS Dev Infra: Done · ExecuTorch Benchmark: Done · ExecuTorch DevX: Done
4 participants