[Issue]: nvidia-smi not found
#515
Comments
@joerowell We can add it later, after we merge this fork with upstream.

@jataylo @micmelesse This seems to be related to the nvsmi-related test failure. What is the status of that test?

Hi, any update on the rocm-smi version of this?

We got around this in inductor by hard-coding flops for specific arches when we required this; @zhanglx13 @micmelesse may have to consider writing amdsmi equivalents in Triton.

Can you give me the relevant code paths?

@MARD1NO Not sure if this helps at all: https://github.com/pytorch/pytorch/blob/main/torch/_utils_internal.py#L208 This is how we had to get around the nvsmi call from Triton previously, by hard-coding max clock rates for our gfx arches. Not ideal.
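The workaround described above can be sketched roughly as follows. This is not the actual torch/_utils_internal.py code; the function name, the device-name keys, and the clock values are illustrative placeholders only.

```python
import torch

# Illustrative mapping only: device-name substring -> assumed peak compute clock (MHz).
# These numbers are placeholders, not vendor-verified figures.
_MAX_CLOCK_MHZ = [
    ("MI300", 2100),
    ("MI250", 1700),
]


def max_clock_rate_mhz(default_mhz: int = 1500) -> int:
    """Best-effort peak clock for the current GPU, without shelling out to any *-smi tool."""
    name = torch.cuda.get_device_name(0)  # e.g. "AMD Instinct MI300X" on ROCm builds
    for key, mhz in _MAX_CLOCK_MHZ:
        if key in name:
            return mhz
    return default_mhz
```

Hard-coding keeps autotuning working, but it has to be updated for every new arch, which is why the comments above lean towards proper rocm-smi/amdsmi support instead.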
Problem Description

The `estimate_matmul` functionality in Triton relies rather heavily on the underlying stats of the GPU. On CUDA platforms, this functionality is realised by calling `nvidia-smi` and then parsing the results. I see that this code is still present in this fork of Triton: triton/python/triton/testing.py (line 12 at 35edd6a).

Would it be possible to get support added for `rocm-smi` here instead? This makes autotuning Triton kernels for GEMM etc. much easier.
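A rocm-smi based replacement could look roughly like the sketch below. This is not existing Triton code; the `--showclocks` flag and the output format being parsed are assumptions that should be checked against the rocm-smi shipped with the installed ROCm version.

```python
import re
import subprocess


def rocmsmi_max_sclk_mhz() -> int:
    """Return the highest graphics clock (MHz) reported by `rocm-smi --showclocks`."""
    out = subprocess.check_output(["rocm-smi", "--showclocks"], text=True)
    # Assumption: rocm-smi prints clock levels such as "... (1700Mhz)".
    # Grab every MHz figure it mentions and keep the largest one.
    clocks = [int(m) for m in re.findall(r"\((\d+)\s*[Mm][Hh]z\)", out)]
    if not clocks:
        raise RuntimeError("could not parse any clock values from rocm-smi output")
    return max(clocks)
```

Longer term, using the amdsmi Python bindings (as suggested in the comments above) would avoid shelling out and parsing text entirely.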
Operating System

CPU
GPU
AMD Instinct MI300X
ROCm Version
ROCm 6.0.0
ROCm Component
No response
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response