[codegen][gpu] Performance investigation of default convolution filter layout #20105

Open
jerryyin opened this issue Feb 26, 2025 · 0 comments
Labels
enhancement ➕ New feature or request

Comments

@jerryyin
Member

Request description

This is a medium/low priority follow-up to #19701.

By the end of that ticket, we decided to use the FHWC filter layout by default in the preprocessing pipeline (#19974), for the following reasons:

  • With the FHWC layout, the gemmK dimensions H, W, and C are adjacent and collapse into a single reduction dimension, which is cleaner than with the HWFC layout
  • FHWC is a more common choice among different frameworks and libraries
  • In the limited tuning benchmarks run so far, FHWC and HWFC deliver similar performance
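
As a rough illustration of the first point, the sketch below checks whether the reduction dimensions H, W, and C can be merged into one dimension without moving data, assuming a dense row-major filter. It is not IREE code: the helper names `row_major_strides` and `can_collapse` and the example sizes are hypothetical and chosen only for illustration.

```python
def row_major_strides(shape):
    """Element strides of a dense row-major tensor with the given shape."""
    strides = []
    step = 1
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return list(reversed(strides))


def can_collapse(layout, sizes, dims):
    """Return True if `dims` (in order) can be merged into a single
    dimension of a row-major tensor laid out as `layout` without a copy.

    Two neighbouring dims merge only when the outer stride equals the
    inner stride times the inner extent, i.e. nothing sits between them.
    """
    shape = [sizes[d] for d in layout]
    strides = dict(zip(layout, row_major_strides(shape)))
    for outer, inner in zip(dims, dims[1:]):
        if strides[outer] != strides[inner] * sizes[inner]:
            return False
    return True


# Hypothetical 3x3 filter with 32 input channels and 64 output channels.
sizes = {"F": 64, "H": 3, "W": 3, "C": 32}

fhwc = can_collapse("FHWC", sizes, "HWC")  # H, W, C are adjacent
hwfc = can_collapse("HWFC", sizes, "HWC")  # F sits between W and C
print(f"FHWC collapses HWC: {fhwc}, HWFC collapses HWC: {hwfc}")
```

Under these assumptions, FHWC yields one reduction dimension of size H*W*C, while HWFC leaves the reduction split into an HW part and a C part with F in between.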

However, the performance evaluation was not comprehensive, and a different default layout may be better in some scenarios. This ticket tracks the work to determine the right default filter layout for convolutions in the preprocessing pipelines.

To close this out, we'd like a comprehensive study of how the filter layout affects tuned convolution performance, documentation of the results, and, where necessary, changes to the implementation of iree-preprocessing-convert-conv-filter-to-channels-last.

What component(s) does this issue relate to?

No response

Additional context

No response
