Fixing llama-cpp-python package compilation error for ROCm acceleration of llama.cpp loader #6051
LeonardoSidney started this conversation in Show and tell
Hello. I was having problems using llama.cpp with AMD's ROCm, and while looking through discussions like this one I ended up hitting a compilation failure with the current llama-cpp-python 0.2.75/0.2.76 package. In my research I found a group of people with the same compilation problem, so I wanted to share a temporary fix.
Maybe from here on people will be able to find a definitive solution faster than my stopgap.
I prefer to use a venv for this kind of setup, so I will follow that line of reasoning.
At this point I chose python3.11 as the interpreter.
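The creation command itself was not in the original notes, so this is a sketch assuming the environment is simply named venv (which matches the activation step below):
python3.11 -m venv venv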
Next we need to activate the local venv.
source venv/bin/activate
I prefer to install the stable version at this time, PyTorch 2.3, because PyTorch 2.4 with ROCm 6.1 gave me problems.
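For reference, a hedged example of installing the stable PyTorch 2.3 ROCm wheels (the index URL below is the standard PyTorch ROCm 6.0 wheel index; adjust it to whatever the PyTorch site recommends for your setup):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0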
Just install the dependencies normally.
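For text-generation-webui on AMD this is usually the AMD requirements file; the exact filename can vary between releases, so treat this as an assumption:
pip install -r requirements_amd.txt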
Here comes the "technical adaptation" that I call a "gambiarra" (a workaround). We need to uninstall the package that we are going to "replace". The idea is simple: if I try to update it in place, pip will do more than just install llama-cpp-python, so uninstalling first is easier for me, who only just discovered this solution and didn't see many people commenting on it.
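The uninstall itself is just the usual pip command (llama-cpp-python is the package name text-generation-webui uses; if your environment also pulled a prebuilt variant under another name, remove that one as well):
pip uninstall -y llama-cpp-python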
I got to this point by compiling the llama.cpp project itself, which showed me that the problem was in the llama-cpp-python package. After looking in a few places for the error that appeared when using the command:
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
I found a reply in the llama-cpp-python project's discussions about the compilation issue:
CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DCMAKE_PREFIX_PATH=/opt/rocm" FORCE_CMAKE=1 pip install llama-cpp-python
So, adapting the answer above to the llama.cpp project's compilation instructions for my GPU and to the 0.2.75 package recommended for this project (the latest version of the package did not compile), it ended up like this:
CMAKE_ARGS="/opt/rocm/llvm/bin/clang HIP_PATH=/opt/rocm -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1100 -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.75
Maybe it was a little overdone, but that's what worked.
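As a quick sanity check that the replacement build is the one now installed in the venv (a suggestion of mine, not part of the original steps), you can confirm the package version:
python -c "import llama_cpp; print(llama_cpp.__version__)"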
After that, it was just a matter of running the project.
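For text-generation-webui that is the usual launcher script; I'm assuming a plain source checkout here rather than the one-click installer:
python server.py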
Check the "cpu" option in text-generation-webui's model loading settings for the llama.cpp loader: since during installation we replaced the llama-cpp-python package that the project normally uses for CPU-only acceleration, that code path now points at our ROCm build.
Once this is done, we can see in the screenshot below that text-generation-webui is able to use ROCm via llama.cpp through what we affectionately call a gambiarra.
I'm just sharing a temporary solution; maybe I'll go into more depth later.