add more limitations to use subgroup
qjia7 committed Jan 2, 2025
1 parent dff1961 commit 8328e14
Showing 1 changed file with 1 addition and 1 deletion.
@@ -569,7 +569,7 @@ Status MatMulNBits::ComputeInternal(onnxruntime::webgpu::ComputeContext& context
// TODO: Support output_number > 1. Some cases are failed when output_number > 1.
constexpr uint32_t output_number = 1;
const uint32_t tile_m = M > kMinMForTileOptimization ? 4 : 1;
- const bool use_subgroup = context.Device().HasFeature(wgpu::FeatureName::Subgroups) && context.AdapterInfo().vendor == std::string_view{"intel"};
+ const bool use_subgroup = context.Device().HasFeature(wgpu::FeatureName::Subgroups) && context.AdapterInfo().vendor == std::string_view{"intel"} && components_a == 4 && block_size == 32;
MatMulNBitsProgram program{output_number, block_size, tile_m, gsl::narrow<int>(components_b), has_zero_points, use_subgroup};
if (M > kMinMForTileOptimization && block_size == 32) {
components = 1;
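For readers who want to reason about the new gate in isolation, the condition can be sketched as a standalone predicate. This is only an illustration, not code from the commit: `SubgroupInputs`, `ShouldUseSubgroup`, and `CheckSubgroupGate` are hypothetical names, and the struct fields stand in for the values the real code reads from `context.Device()`, `context.AdapterInfo()`, and the local variables `components_a` and `block_size`.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for the values the real code obtains from
// context.Device().HasFeature(...), context.AdapterInfo().vendor,
// and the locals components_a / block_size.
struct SubgroupInputs {
  bool has_subgroups_feature;
  std::string_view vendor;
  uint32_t components_a;
  uint32_t block_size;
};

// Mirrors the boolean expression on the added line of the diff:
// all four conditions must hold for the subgroup path to be chosen.
inline bool ShouldUseSubgroup(const SubgroupInputs& in) {
  return in.has_subgroups_feature &&
         in.vendor == std::string_view{"intel"} &&
         in.components_a == 4 &&
         in.block_size == 32;
}

// Small self-check showing which inputs pass the gate.
inline void CheckSubgroupGate() {
  // Passed both before and after this commit.
  assert(ShouldUseSubgroup(SubgroupInputs{true, "intel", 4, 32}));
  // Passed before this commit, rejected after it.
  assert(!ShouldUseSubgroup(SubgroupInputs{true, "intel", 2, 32}));
  assert(!ShouldUseSubgroup(SubgroupInputs{true, "intel", 4, 64}));
}
```

Under this sketch, an Intel adapter with the subgroups feature that previously took the subgroup path with, say, `components_a == 2` now falls back to the non-subgroup path.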
