
Please release a cuda build for v0.3.5 #1925

Open
ParisNeo opened this issue Feb 7, 2025 · 9 comments
Comments


ParisNeo commented Feb 7, 2025

Hi there. I see there is a Metal build for v0.3.5. Would you please release a CUDA version?

Best regards


la1ty commented Feb 9, 2025

Agree. I've manually built a CUDA version, but an official prebuilt release should be convenient for most users.

@Amrabdelhamed611
> Agree. I've manually built a CUDA version, but an official prebuilt release should be convenient for most users.

How? Is there a reference for the manual build?


la1ty commented Feb 9, 2025

The latest version is v0.3.7. You can follow the steps in the CI workflow.

For Windows users, here are my two cents:

  1. (Optional) Uninstall all MinGW tools (clang, gcc, etc.) so a MinGW toolchain is not picked up by mistake.
  2. Install Visual Studio 2022 with MSVC 2022, CMake, and the Windows SDK. If you need to build with CUDA < 12.4, you should also install MSVC 2019. (You may need to add the directory of cmake.exe to PATH manually. Make sure that when you call cmake in PowerShell it uses the VS version of cmake.exe.)
  3. Install CUDA.
  4. Copy the four files from the CUDA MSBuildExtensions directory to the VS BuildCustomizations directory. (Everything in that directory may be useful.)
  5. Clone the repository together with its llama.cpp submodule (e.g. git clone --recurse-submodules).
  6. Activate the Python environment and run the following commands in PowerShell:
$env:CMAKE_ARGS = "-DGGML_CUDA=ON"
python -m pip install build wheel
python -m build --wheel

If you need to build with CUDA < 12.4, use MSVC 2019:

$env:CMAKE_ARGS = "-DGGML_CUDA=ON -DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29"
python -m pip install build wheel
python -m build --wheel
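For context on why the first command matters: the build backend reads CMAKE_ARGS from the environment and splits it into individual flags that are forwarded to cmake. A minimal sketch of that splitting, using only the Python standard library (the exact rules scikit-build-core applies may differ; this is illustrative):

```python
import os
import shlex

def cmake_args_from_env(env=os.environ):
    """Split the CMAKE_ARGS environment variable into a list of CMake flags.

    Approximates how build backends forward extra arguments to `cmake`.
    """
    raw = env.get("CMAKE_ARGS", "")
    return shlex.split(raw)

# Example with the MSVC 2019 flags from above. Note that the comma-separated
# toolset specification stays a single argument because it contains no spaces.
env = {"CMAKE_ARGS": "-DGGML_CUDA=ON -DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29"}
print(cmake_args_from_env(env))
# -> ['-DGGML_CUDA=ON', '-DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29']
```

This is also why the variable must be set in the same PowerShell session that runs python -m build: the flags only reach CMake through the environment.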


ParisNeo commented Feb 9, 2025

@abetlen would you please add the workflow suggested by @la1ty to automate the generation of these builds as you release new versions?


ZiyaCu commented Feb 10, 2025

+1 for pre-built whl's

@Amrabdelhamed611

@ZiyaCu @ParisNeo @la1ty, check out this repo: the textgen-webui release includes llama-cpp-python CUDA wheels.

The only downside is that these wheels can't be imported using import llama_cpp. Instead, you should use import llama_cpp_cuda or import llama_cpp_cuda_tensorcore, depending on the wheel you installed.

You can find the wheels in the requirements file:
🔗 Requirements.txt

Or check the full release here:
🔗 llama-cpp-python-cuBLAS-wheels Release
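The module-name difference mentioned above can be smoothed over with a fallback import, so the same code runs regardless of which wheel is installed. A minimal sketch (the helper name import_first is mine; the candidate module names come from the comment above):

```python
import importlib

def import_first(candidates):
    """Return the first module from `candidates` that imports successfully."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {candidates} could be imported")

# Usage sketch: prefer the plain package, then fall back to the CUDA
# wheels' module names.
# llama_cpp = import_first(
#     ["llama_cpp", "llama_cpp_cuda", "llama_cpp_cuda_tensorcore"])
```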

@ParisNeo

@Amrabdelhamed611 thanks a lot, I'll take a look.
I am using this in lollms, which should work on all kinds of systems, and it is a real pain having to write custom code for every configuration.


dw5189 commented Feb 21, 2025

PS E:\llama-cpp-python> conda activate CUDA125-py312
(CUDA125-py312) PS E:\llama-cpp-python> $env:CMAKE_ARGS = "-DGGML_CUDA=ON"
(CUDA125-py312) PS E:\llama-cpp-python> python -m pip install build wheel
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple/, http://mirrors.aliyun.com/pypi/simple/
Collecting build
Downloading http://mirrors.aliyun.com/pypi/packages/84/c2/80633736cd183ee4a62107413def345f7e6e3c01563dbca1417363cf957e/build-1.2.2.post1-py3-none-any.whl (22 kB)
Requirement already satisfied: wheel in d:\software\minipy312\envs\cuda125-py312\lib\site-packages (0.45.1)
Requirement already satisfied: packaging>=19.1 in d:\software\minipy312\envs\cuda125-py312\lib\site-packages (from build) (24.2)
Collecting pyproject_hooks (from build)
Downloading http://mirrors.aliyun.com/pypi/packages/bd/24/12818598c362d7f300f18e74db45963dbcb85150324092410c8b49405e42/pyproject_hooks-1.2.0-py3-none-any.whl (10 kB)
Collecting colorama (from build)
Downloading http://mirrors.aliyun.com/pypi/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Installing collected packages: pyproject_hooks, colorama, build
Successfully installed build-1.2.2.post1 colorama-0.4.6 pyproject_hooks-1.2.0
(CUDA125-py312) PS E:\llama-cpp-python> python -m build --wheel

  • Creating isolated environment: venv+pip...
  • Installing packages in isolated environment:
    • scikit-build-core[pyproject]>=0.9.2
  • Getting build dependencies for wheel...
  • Building wheel...
    *** scikit-build-core 0.10.7 using CMake 3.31.4 (wheel)
    *** Configuring CMake...
    2025-02-22 02:34:20,699 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
    loading initial cache file C:\Users\ADMINI~1\AppData\Local\Temp\tmpui8yd0_s\build\CMakeInit.txt
    -- Building for: Visual Studio 17 2022
    -- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.26100.
    -- The C compiler identification is MSVC 19.43.34808.0
    -- The CXX compiler identification is MSVC 19.43.34808.0
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.43.34808/bin/Hostx64/x64/cl.exe - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.43.34808/bin/Hostx64/x64/cl.exe - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.47.1.windows.2")
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
    -- Looking for pthread_create in pthreads
    -- Looking for pthread_create in pthreads - not found
    -- Looking for pthread_create in pthread
    -- Looking for pthread_create in pthread - not found
    -- Found Threads: TRUE
    -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
    -- CMAKE_SYSTEM_PROCESSOR: AMD64
    -- CMAKE_GENERATOR_PLATFORM: x64
    -- Including CPU backend
    -- Found OpenMP_C: -openmp (found version "2.0")
    -- Found OpenMP_CXX: -openmp (found version "2.0")
    -- Found OpenMP: TRUE (found version "2.0")
    -- x86 detected
    -- Performing Test HAS_AVX_1
    -- Performing Test HAS_AVX_1 - Success
    -- Performing Test HAS_AVX2_1
    -- Performing Test HAS_AVX2_1 - Success
    -- Performing Test HAS_FMA_1
    -- Performing Test HAS_FMA_1 - Success
    -- Performing Test HAS_AVX512_1
    -- Performing Test HAS_AVX512_1 - Failed
    -- Performing Test HAS_AVX512_2
    -- Performing Test HAS_AVX512_2 - Failed
    -- Adding CPU backend variant ggml-cpu: /arch:AVX2 GGML_AVX2;GGML_FMA;GGML_F16C
    -- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.5/include (found version "12.5.82")
    -- CUDA Toolkit found
    -- Using CUDA architectures: native
    CMake Error at D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCompilerId.cmake:614 (message):
    No CUDA toolset found.
    Call Stack (most recent call first):
    D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
    D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
    D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCUDACompiler.cmake:131 (CMAKE_DETERMINE_COMPILER_ID)
    vendor/llama.cpp/ggml/src/ggml-cuda/CMakeLists.txt:25 (enable_language)

-- Configuring incomplete, errors occurred!

*** CMake configuration failed

ERROR Backend subprocess exited when trying to invoke build_wheel


la1ty commented Feb 22, 2025

@dw5189 There are two possible causes I can guess at:

  1. Make sure you are using the VS version of cmake.exe to compile this project. Running cmake --version in PowerShell returns cmake version 3.29.5-msvc4 for me. (I tried the MinGW version and it failed. Your log looks normal so far, though, so good luck.)
  2. Copy the four files from the CUDA MSBuildExtensions directory to the VS BuildCustomizations directory. If you don't know how to do that, search for "No CUDA toolset found" in any web search engine; it should return plenty of pages with details.
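A quick way to check point 2 is to look for the CUDA customization files in the VS BuildCustomizations directory before retrying the build. The file names and directory below are assumptions for illustration (they vary by CUDA and VS version; check your local installation):

```python
from pathlib import Path

# Hypothetical file names for a CUDA 12.5 install; adjust to your version.
CUDA_BUILD_FILES = [
    "CUDA 12.5.props",
    "CUDA 12.5.targets",
    "CUDA 12.5.xml",
    "Nvda.Build.CudaTasks.v12.5.dll",
]

def missing_customizations(build_customizations_dir, expected=CUDA_BUILD_FILES):
    """Return the expected CUDA MSBuild files that are absent from the
    given Visual Studio BuildCustomizations directory."""
    root = Path(build_customizations_dir)
    return [name for name in expected if not (root / name).exists()]

# Usage sketch (path is hypothetical):
# missing = missing_customizations(
#     r"C:\Program Files\Microsoft Visual Studio\2022\Professional"
#     r"\MSBuild\Microsoft\VC\v170\BuildCustomizations")
# if missing:
#     print("Copy these from the CUDA MSBuildExtensions directory:", missing)
```

If the list is non-empty, CMake will typically fail with exactly the "No CUDA toolset found" error shown in the log above.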


5 participants