[codegen][gpu] Adding conv filter layout fhwc to preprocessing pipeline #19974
Conversation
Hmm, looks like the regression tests for SDXL had a couple of runtime failures. That's a bit surprising to me. I'll rerun the jobs and see if it reproduces.
I think there is a numeric change.
Let me rebase; the base branch is slightly older now.
Force-pushed from 0cc26c5 to 0be3371
I've got some nice suggestions from @Max191 and @nirvedhmeshram about how to triage those model failures offline; WIP on this. The summary of failures indicates:
…d_to_intrinsics on convolution (#20073) The `pad_to_intrinsics` pass only supports `linalg.conv2d` ops with the `nhwc_hwcf` convolution layout. This has created inconvenience around taking advantage of other convolution variants for their performance potential. One such scenario: the IR from `conv_filter_to_channels_last` will produce `conv2d_nhwc_fhwc` represented by `linalg.generic`. This PR extends the `pad_to_intrinsics` pass to support other convolution variants, including: - those represented with `linalg.generic` - other (filter) layouts such as `fhwc` and `fchw` This PR unblocks #19974 and allows us to continue using `pad_to_intrinsics` while the igemm padding kernels catch up in performance. --------- Signed-off-by: jerryyin <[email protected]>
Force-pushed from 0be3371 to bcc6d69
Signed-off-by: jerryyin <[email protected]>
The original mixed precision test is passing after rebase.
Force-pushed from 3ef18c3 to b281af1
Signed-off-by: jerryyin <[email protected]>
LGTM!
This PR is a follow-up to #19739 and #19798. In those PRs, we allowed the convolution filter layout to be converted according to pipeline options and lowered through the igemm lowering pipeline. As @Max191's tuning on convolution demonstrated the performance benefit of this pass, we decided to plumb the pass in by default in this follow-up PR.
In this PR, we turn the pass on by default in the sdxl inference pipeline `iree-preprocessing-transpose-convolution-pipeline`. This allows the pass to run in convolution-related inference workloads. It will also pick the `fhwc` layout by default if no option is explicitly set for `filter-layout`.
Credit to discussion with @Max191 and @nirvedhmeshram: we don't always want to turn this pass on, because the transpose on the filter can still be an overhead in training workloads. This pass matches the location where `ConvertConvToChannelsLast` is.
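For intuition on what the filter layout change means, here is a minimal NumPy sketch (not IREE code; function names and shapes are illustrative assumptions). It shows that a 2-D convolution with an `hwcf` filter and the same convolution with the filter transposed to `fhwc` produce identical results, which is why the transpose is a pure layout/preprocessing choice:

```python
import numpy as np

def conv2d_nhwc_hwcf(x, w):
    # Naive valid conv: x is (N, H, W, C), w is (KH, KW, C, F) -> (N, OH, OW, F).
    N, H, W, C = x.shape
    KH, KW, _, F = w.shape
    out = np.zeros((N, H - KH + 1, W - KW + 1, F))
    for i in range(H - KH + 1):
        for j in range(W - KW + 1):
            patch = x[:, i:i + KH, j:j + KW, :]            # (N, KH, KW, C)
            # Contract over KH, KW, C; filter dims 0, 1, 2 match.
            out[:, i, j, :] = np.tensordot(patch, w, axes=([1, 2, 3], [0, 1, 2]))
    return out

def conv2d_nhwc_fhwc(x, w_fhwc):
    # Same conv, but the filter is stored as (F, KH, KW, C).
    N, H, W, C = x.shape
    F, KH, KW, _ = w_fhwc.shape
    out = np.zeros((N, H - KH + 1, W - KW + 1, F))
    for i in range(H - KH + 1):
        for j in range(W - KW + 1):
            patch = x[:, i:i + KH, j:j + KW, :]            # (N, KH, KW, C)
            # Same contraction; filter dims 1, 2, 3 now carry KH, KW, C.
            out[:, i, j, :] = np.tensordot(patch, w_fhwc, axes=([1, 2, 3], [1, 2, 3]))
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 5, 5, 3))
w = rng.standard_normal((2, 2, 3, 4))          # hwcf layout
w_fhwc = np.transpose(w, (3, 0, 1, 2))         # hwcf -> fhwc
assert np.allclose(conv2d_nhwc_hwcf(x, w), conv2d_nhwc_fhwc(x, w_fhwc))
```

The two variants compute the same values; the `fhwc` form simply places the output-channel (`f`) dimension outermost in the filter, which can map better onto igemm-style lowering.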