
AOT perf regression on Apple platform #7967

Open
guangy10 opened this issue Jan 27, 2025 · 1 comment
Labels
module: ci Issues related to continuous integration triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@guangy10
Contributor

🐛 Describe the bug

Over the weekend I started observing timeouts when exporting models.

Affected models:

  • mv3
  • mv2

Affected configs:

  • xnnpack_q8
  • coreml_fp16

Affected platforms:

  • macOS only.

Expected:
The export jobs used to finish within 30 minutes.

Actual:
Some jobs have started hitting the 60-minute timeout threshold since 1/25.
1/26
- mv3, xnnpack_q8: 60min: https://github.com/pytorch/executorch/actions/runs/12980356644/job/36197276541
- mv3, coreml_fp16: 60min: https://github.com/pytorch/executorch/actions/runs/12980356644/job/36197276201
1/25
- mv3 xnnpack_q8: 60min: https://github.com/pytorch/executorch/actions/runs/12970194466/job/36175156930
- mv2 xnnpack_q8: 60min: https://github.com/pytorch/executorch/actions/runs/12970194466/job/36175156520

View historical runs of (mv3, xnnpack_q8) on macOS: https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50&name_filter=apple-perf%20%2F%20export-models%20(mv3%2C%20xnnpack&mergeLF=true

View historical runs of (mv3, coreml_fp16) on macOS: https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50&name_filter=apple-perf%20%2F%20export-models%20(mv3%2C%20coreml&mergeLF=true

Use this query to view all details from the scheduled runs: https://github.com/pytorch/executorch/actions/workflows/apple-perf.yml?query=event%3Aschedule

Versions

trunk

@guangy10
Contributor Author

guangy10 commented Jan 27, 2025

There isn't much data about how long each step takes. To understand where the slowness comes from, we can probably start by adding a timer to each step.
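As a minimal sketch of that idea, each step in the workflow script could be wrapped in a timing helper so the job log shows where the time goes. The step label and command below are placeholders, not the real export steps from apple-perf.yml:

```shell
#!/usr/bin/env bash
# Hypothetical per-step timing sketch for a CI job script.
set -euo pipefail

run_timed() {
  # Run a command and report its wall-clock duration in the job log.
  local label="$1"; shift
  local start end
  start=$(date +%s)
  "$@"
  end=$(date +%s)
  echo "${label} took $((end - start))s"
}

# Example usage with a placeholder step:
run_timed "export-step" true
```

With each export stage wrapped this way, comparing logs from a fast run and a slow run would show which stage regressed.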

@manuelcandales manuelcandales added module: ci Issues related to continuous integration triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Jan 27, 2025