Fixing llama-cpp-python package compilation error for ROCm acceleration of llama.cpp loader #6051
LeonardoSidney started this conversation in Show and tell
Hello. I was having problems using llama.cpp with AMD's ROCm, and while looking through discussions like this one I ended up hitting a compilation failure with the current llama-cpp-python 0.2.75/0.2.76 package. In my research I found a group of people with the same compilation problem, so I wanted to share a temporary fix.
Maybe from here on people will be able to find a definitive solution faster than my stopgap.
I prefer to use a venv for this kind of setup, so I will follow that line of reasoning.
At this point I chose python3.11 as the interpreter.
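The creation command itself was not in the original notes, so this is a sketch assuming the environment is simply named venv (which matches the activation step below):
python3.11 -m venv venv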
Next we need to activate the local venv.
source venv/bin/activate
I prefer to install the stable version at this time, PyTorch 2.3, because PyTorch 2.4 with ROCm 6.1 gave me problems.
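For reference, a hedged example of installing the stable PyTorch 2.3 ROCm wheels (the index URL below is the standard PyTorch ROCm 6.0 wheel index; adjust it to whatever the PyTorch site recommends for your setup):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0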
Just install the dependencies normally.
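For text-generation-webui on AMD this is usually the AMD requirements file; the exact filename can vary between releases, so treat this as an assumption:
pip install -r requirements_amd.txt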
Here comes the "technical adaptation" that I call a "gambiarra" (a workaround). We need to uninstall the package that we are going to "replace". The idea is simple: if I try to update it in place, pip will do more than just install llama-cpp-python, so uninstalling first is easier for me, who only just discovered this solution and didn't see many people commenting on it.
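The uninstall itself is just the usual pip command (llama-cpp-python is the package name text-generation-webui uses; if your environment also pulled a prebuilt variant under another name, remove that one as well):
pip uninstall -y llama-cpp-python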
I got to this point by compiling the llama.cpp project itself, which showed me that the problem was in the llama-cpp-python package. After looking in a few places for the error that appeared when using the command:
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
I found a reply in the llama-cpp-python project's discussions about the compilation issue:
CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DCMAKE_PREFIX_PATH=/opt/rocm" FORCE_CMAKE=1 pip install llama-cpp-python
So, adapting the answer above to the llama.cpp project's compilation instructions for my GPU and to the 0.2.75 package recommended for this project (the latest version of the package did not compile), it ended up like this:
CMAKE_ARGS="/opt/rocm/llvm/bin/clang HIP_PATH=/opt/rocm -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1100 -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.75
Maybe it was a little overdone, but that's what worked.
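As a quick sanity check that the replacement build is the one now installed in the venv (a suggestion of mine, not part of the original steps), you can confirm the package version:
python -c "import llama_cpp; print(llama_cpp.__version__)"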
After that, it was just a matter of running the project.
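For text-generation-webui that is the usual launcher script; I'm assuming a plain source checkout here rather than the one-click installer:
python server.py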
Check the "cpu" option in text-generation-webui's model loading settings for the llama.cpp loader: since during installation we replaced the llama-cpp-python package that the project normally uses for CPU-only acceleration, that code path now points at our ROCm build.
Once this is done, we can see in the screenshot below that text-generation-webui is able to use ROCm via llama.cpp through what we affectionately call a gambiarra.
I'm just sharing a temporary solution; maybe I'll go into more depth later.