[GPU] Don't do large tensor tiling for linalg ops with no payload
Signed-off-by: Nirvedh Meshram <[email protected]>
nirvedhmeshram committed Feb 24, 2025
1 parent d6da252 commit 376020b
Showing 2 changed files with 20 additions and 1 deletion.
@@ -184,7 +184,11 @@ static void processRegion(RewriterBase &rewriter, Region *region,
// significant computation anyway. Equivalent generics are still tiled
// as they typically arise organically. Fills in particular are almost
// never found on their own and will be fused when tiling if need be.
if (isa<linalg::TransposeOp, linalg::CopyOp, linalg::FillOp>(op)) {
// Additionally, if linalg.yield is the only op in the linalg payload, then it
// is a reshape/broadcast-type op and the same expectation that such ops are
// carefully introduced holds. E.g., they could be transpose
// ops that got generalized by a reshape fusion pattern.
if (linalgOp.getBlock()->getOperations().size() == 1) {
continue;
}
tileToMaxVectorSize(rewriter, linalgOp, maxVectorSize);
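The replaced `isa<linalg::TransposeOp, linalg::CopyOp, linalg::FillOp>` check and the new payload-size check implement the same heuristic. As a rough illustration, the condition can be modeled in plain Python (hypothetical helper name, not the IREE/MLIR C++ API):

```python
def should_skip_tiling(payload_op_names):
    """Model of the new condition: a linalg op whose payload block
    contains only its mandatory linalg.yield terminator performs no
    real computation (transpose/copy/fill/broadcast-style ops), so
    large-tensor tiling is skipped for it.

    Hypothetical sketch; the actual pass checks
    linalgOp.getBlock()->getOperations().size() == 1 in C++.
    """
    return len(payload_op_names) == 1

# A generalized transpose/reshape: the payload is just the terminator.
print(should_skip_tiling(["linalg.yield"]))                # True
# An elementwise computation still gets tiled.
print(should_skip_tiling(["arith.addi", "linalg.yield"]))  # False
```

This generalizes the previous named-op check, since transpose, copy, and fill ops likewise carry a yield-only body.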
@@ -128,3 +128,18 @@ func.func @no_tile_fill(%arg0: f32) -> tensor<64x256xf32> {
// CHECK-NOT: scf.for
// CHECK: %[[FILL:.+]] = linalg.fill
// CHECK: return %[[FILL]]


func.func @no_tile_reshape(%arg0: tensor<1x4x4x1x4x4xi32>) -> tensor<1x4x1x4x4x4xi32> {
%empty = tensor.empty() : tensor<1x4x1x4x4x4xi32>
%0 = linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d1, d3, d2, d4, d5)>, affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d1, d2, d3, d4, d5)>], iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "parallel"]} ins(%arg0 : tensor<1x4x4x1x4x4xi32>) outs(%empty : tensor<1x4x1x4x4x4xi32>) {
^bb0(%in: i32, %out: i32):
linalg.yield %in : i32
} -> tensor<1x4x1x4x4x4xi32>
return %0 : tensor<1x4x1x4x4x4xi32>
}

// CHECK-LABEL: func.func @no_tile_reshape
// CHECK-NOT: scf.for
// CHECK: %[[RESHAPE:.+]] = linalg.generic
// CHECK: return %[[RESHAPE]]
