[Bug Report] ttnn.gcd doesn't support int32 #17771
Comments
@umalesTT, please take a look. FYI @cmaryanTT
Perhaps I could have a go at writing an optimised GCD implementation for int32, as a learning exercise? The first question I have is about the best way to benchmark the existing implementation. Can I get a cycle count? Something like CUDA, where I can record events on-device and then retrieve the precise timings afterwards, would be ideal.
@jasondavies - absolutely, try it out! Here's the documentation for our profiling tool:
This is around 50x faster compared to the old/limited floating point implementation. Fixes tenstorrent#17771.
This is probably the wrong tag? @umalesTT is in Forge - training.
Describe the bug
The official documentation states that ttnn.gcd supports only INT32. However, testing makes it clear that INT32 doesn't work properly.
Testing reveals that floating point inputs work as expected.
Note that the restriction of input values to the range [-1024, 1024] doesn't really make sense. Looking at the internal implementation, the comments say "limited precision in bfloat16 format decreases support for input to the range [-1024, 1024]". However, bfloat16 stores only 7 explicit significand bits (8 including the implicit bit), so it can represent integers exactly only up to 256, which doesn't match the stated range. float16, with 10 explicit significand bits, represents integers exactly up to 2048, which does match. Maybe the comment was supposed to say "float16"?
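The precision claim can be checked directly in PyTorch (used here purely as an illustration; this is not the ttnn kernel):

```python
import torch

# bfloat16 keeps 8 significand bits (7 stored + 1 implicit), so integers are
# exact only up to 256; 257 falls on a rounding tie and rounds to even -> 256.0.
bf16 = torch.tensor(257, dtype=torch.bfloat16).item()

# float16 keeps 11 significand bits (10 stored + 1 implicit), so integers are
# exact up to 2048; 257 survives unchanged.
fp16 = torch.tensor(257, dtype=torch.float16).item()

print(bf16, fp16)  # 256.0 257.0
```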
In any case, I think the ideal fix would be to add support for int32, and extend the maximum number of iterations to cover the full int32 input range.
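As a hypothetical sketch of what an int32 fix could look like (this is not the ttnn kernel, and `gcd_int32` is an illustrative name): binary GCD (Stein's algorithm) needs only shifts, comparisons, and subtractions, and its iteration count is bounded by roughly twice the bit width, so a fixed cap of about 62 iterations covers the full int32 range.

```python
def gcd_int32(a: int, b: int) -> int:
    """Binary GCD restricted to int32-range inputs; gcd(0, 0) is defined as 0."""
    a, b = abs(a), abs(b)
    if a == 0:
        return b
    if b == 0:
        return a
    # Factor out the powers of two common to both operands.
    shift = 0
    while ((a | b) & 1) == 0:
        a >>= 1
        b >>= 1
        shift += 1
    # Make a odd; the loop invariant below keeps a odd throughout.
    while (a & 1) == 0:
        a >>= 1
    # Each iteration halves b at least once, so for 31-bit magnitudes
    # roughly 2 * 31 iterations suffice -- a fixed bound suitable for
    # a device kernel with a static loop count.
    while b != 0:
        while (b & 1) == 0:
            b >>= 1
        if a > b:
            a, b = b, a
        b -= a
    return a << shift


print(gcd_int32(2**31 - 1, 12345))  # 1 (2**31 - 1 is prime)
```

The appeal over the Euclidean algorithm here is that no division or modulo is needed, only operations that are cheap in a fixed-point ALU.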
To Reproduce
Expected behavior
Output should match PyTorch.
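For reference, torch.gcd already handles int32 tensors, including values well outside the [-1024, 1024] range (values below are arbitrary examples):

```python
import torch

# torch.gcd operates element-wise on integer tensors.
a = torch.tensor([12, 2**30, 270], dtype=torch.int32)
b = torch.tensor([18, 2**30 - 4, 192], dtype=torch.int32)

print(torch.gcd(a, b))  # tensor([6, 4, 6], dtype=torch.int32)
```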
Please complete the following environment information:
Additional context