Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU] Add Wave Reduce Intrinsics for i32 type #111342

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

easyonaadit
Copy link

@easyonaadit easyonaadit commented Oct 7, 2024

Currently, wave wide reduction is supported for umin and umax operations only.
This patch extends the support for:
uadd, add, usub, sub, min, max, and, or, xor ops for i32 type.

Copy link

github-actions bot commented Oct 7, 2024

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@easyonaadit easyonaadit changed the title Wave reduce add intrinsic [AMDGPU] Extend Wave Reduce Intrinsics for add, sub, and, or, xor, min, max (Integer type) Oct 8, 2024
@easyonaadit easyonaadit force-pushed the wave-reduce-add-intrinsic branch 3 times, most recently from 16e26e7 to 724879b Compare October 9, 2024 11:48
@easyonaadit easyonaadit changed the title [AMDGPU] Extend Wave Reduce Intrinsics for add, sub, and, or, xor, min, max (Integer type) [AMDGPU] Extend Wave Reduce Intrinsics for i32 Oct 9, 2024
@easyonaadit easyonaadit changed the title [AMDGPU] Extend Wave Reduce Intrinsics for i32 [AMDGPU] Add Wave Reduce Intrinsics for i32 Oct 9, 2024
@easyonaadit easyonaadit changed the title [AMDGPU] Add Wave Reduce Intrinsics for i32 [AMDGPU] Add Wave Reduce Intrinsics for i32 type Oct 9, 2024
@easyonaadit easyonaadit force-pushed the wave-reduce-add-intrinsic branch 4 times, most recently from 490e21e to aa481b8 Compare October 21, 2024 05:52
@@ -2119,8 +2119,14 @@ class AMDGPUWaveReduce<LLVMType data_ty = llvm_anyint_ty> : Intrinsic<
],
[IntrNoMem, IntrConvergent, IntrWillReturn, IntrNoCallback, IntrNoFree, ImmArg<ArgIndex<1>>]>;

def int_amdgcn_wave_reduce_umin : AMDGPUWaveReduce;
def int_amdgcn_wave_reduce_umax : AMDGPUWaveReduce;
multiclass AMDGPUWaveReduceGenerator<list<string> Operations> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename Operations with WaveReduceOps ?

case AMDGPU::S_SUB_I32:
case AMDGPU::S_OR_B32:
case AMDGPU::S_XOR_B32:
return 0x00000000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Literal values are not consistent form here (0, 0x00000000, 0xFFFFFFFF). Can we not query this using std::numeric_limits here also?

const TargetRegisterClass *WaveMaskRegClass = TRI->getWaveMaskRegClass();
const TargetRegisterClass *DstRegClass = MRI.getRegClass(DstReg);
Register ExecMask = MRI.createVirtualRegister(WaveMaskRegClass);
Register CountOfActiveLanesReg = MRI.createVirtualRegister(DstRegClass);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just ActiveLanes ? In general, you are mangling type in your variable names everywhere. we can avoid that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants