[Issue]: Installing ROCm Flash-Attention on RHEL #69
Comments
Were you able to find a solution or workaround for this issue? I am facing the same error with torch.version = 2.4.0+rocm6.1. I was able to install and use flash-attn some time back with Navi32.
Install ROCm devel packages first (https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/install-overview.html,
Thanks for responding. I uninstalled the current ROCm package completely, rebooted, and reinstalled it. I went back from 6.2.0 to 6.1.0.
Hi @varshaprasad96. Were you able to resolve your issue? If so, please close the ticket. Thanks!
Problem Description
We are trying to install ROCm flash-attention on RHEL using steps similar to those mentioned in the Dockerfile, but using a RHEL/UBI9 base image (registry.access.redhat.com/ubi9:latest) instead of rocm/pytorch.
As a prerequisite, the Dockerfile installs setuptools, packaging, ninja, and torch from https://download.pytorch.org/whl/rocm6.0, as recommended on the PyTorch website and the README of the repository.
Here are the versions:
Python: 3.11
Setuptools: 71.1.0
Torch: 2.1.1
The intention is to install flash-attention successfully for ROCm version 6.1.2.
However, we ran into the following issues:
1. The hipify.py script has been modified in recent versions of Torch, causing the patch command to fail. The Dockerfile references this command: flash-attention/Dockerfile.rocm, line 27 in 2554f49.
It looks like the version of hipify.py in Torch 2.1.1 does not match the expected version for ROCm 6.1.2. Could you specify which version of Torch should be used with ROCm 6.1.2 to avoid this issue?
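To make the version mismatch easier to report, a small check like the following can print which torch build is installed and where its hipify sources live. This is a minimal sketch, not part of the repository: the function name describe_torch is hypothetical, the assumption that ROCm wheels carry a "+rocmX.Y" version suffix is taken from the version string quoted in the comments (2.4.0+rocm6.1), and the utils/hipify directory layout is assumed from current torch wheels.

```python
import importlib.metadata
import importlib.util
from pathlib import Path


def describe_torch():
    """Report the installed torch version and hipify location (hypothetical helper)."""
    info = {"torch_version": None, "rocm_tag": None, "hipify_dir": None}
    try:
        info["torch_version"] = importlib.metadata.version("torch")
    except importlib.metadata.PackageNotFoundError:
        return info  # torch is not installed in this environment

    # Assumption: ROCm wheels are tagged with a "+rocmX.Y" local version suffix.
    version = info["torch_version"]
    if "+rocm" in version:
        info["rocm_tag"] = version.split("+rocm", 1)[1]

    # Locate the package on disk without importing torch itself.
    spec = importlib.util.find_spec("torch")
    if spec is not None and spec.origin is not None:
        hipify_dir = Path(spec.origin).parent / "utils" / "hipify"
        if hipify_dir.is_dir():
            info["hipify_dir"] = str(hipify_dir)
    return info


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```

If rocm_tag comes back empty, the installed wheel is probably not a ROCm build, which would be worth ruling out before looking at the patch itself.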
2. pip install errors
When running pip install . after cloning the repository and setting the PYTHON_SITE_PACKAGES path, the following errors appear:
There are 2 issues here:
2.1 Error with setuptools not being available, even though it is present in PYTHON_SITE_PACKAGES.
2.2 Error with nvcc not being present.
For (2.1):
Even after verifying that setuptools is available in the expected location, setting the environment variable, and also setting PYTHONPATH, the error still persists. Is there any way to identify where the shim that pip uses in setup.py is looking during installation? Also, are there any specific version requirements that are being violated?
For (2.2):
We tried setting FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE as suggested in this issue, assuming nvcc would not be required, but the problem still persists.
Tl;dr: these are the major issues we need help with:
setup.py errors?
It would be helpful if anyone could provide guidance or help in resolving these issues. Thank you!
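For anyone trying to reproduce (2.1) and (2.2), the sketch below gathers the facts the build is likely to depend on and prints a verbose pip command. It rests on assumptions that are not confirmed in this issue: that the setuptools error may come from pip's isolated build environment (the default when a pyproject.toml is present, in which case packages already in site-packages are not visible to the build), and that ROCM_HOME and hipcc are relevant to how the extension is compiled. The function names are hypothetical; --no-build-isolation and -v are standard pip options.

```python
import importlib.util
import os
import shutil
import sys


def build_environment_report():
    """Collect build-related facts without importing torch (hypothetical helper)."""
    spec = importlib.util.find_spec("setuptools")
    return {
        "python": sys.executable,
        # Where this interpreter would import setuptools from, if anywhere.
        "setuptools_origin": spec.origin if spec else None,
        # Compilers found on PATH; None means not found.
        "hipcc": shutil.which("hipcc"),
        "nvcc": shutil.which("nvcc"),
        # Assumption: these variables influence the build.
        "ROCM_HOME": os.environ.get("ROCM_HOME"),
        "FLASH_ATTENTION_SKIP_CUDA_BUILD": os.environ.get(
            "FLASH_ATTENTION_SKIP_CUDA_BUILD"
        ),
    }


def pip_install_command(source_dir="."):
    """Build a verbose pip command that reuses the current environment's packages."""
    return [
        sys.executable, "-m", "pip", "install",
        "-v", "--no-build-isolation", source_dir,
    ]


if __name__ == "__main__":
    for key, value in build_environment_report().items():
        print(f"{key}: {value}")
    print(" ".join(pip_install_command()))
```

Running pip through sys.executable rather than a bare pip ensures the install uses the same interpreter whose setuptools location is reported, which helps rule out a mismatch between Python installations.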
Operating System
RHEL/UBI9
CPU
NA
GPU
AMD Instinct MI300X, AMD Instinct MI300A
ROCm Version
ROCm 6.1.0