
Please release a cuda build for v0.3.5 #1925

Open
ParisNeo opened this issue Feb 7, 2025 · 9 comments
Comments


ParisNeo commented Feb 7, 2025

Hi there. I see there is a Metal build for v0.3.5. Would you please release a CUDA version?

Best regards


la1ty commented Feb 9, 2025

Agree. I've manually built a CUDA version, but an official prebuilt release should be convenient for most users.

@Amrabdelhamed611
> Agree. I've manually built a CUDA version, but an official prebuilt release should be convenient for most users.

How? Is there a reference for the manual build?


la1ty commented Feb 9, 2025

The latest version is v0.3.7. You can follow the steps in the CI workflow.

For Windows users, here are my two cents:

  1. (Optional) Uninstall all MinGW tools (clang, gcc, etc.) so a MinGW toolchain is not picked up by mistake.
  2. Install Visual Studio 2022 with MSVC 2022, CMake, and the Windows SDK. If you need to build with CUDA < 12.4, you should also install MSVC 2019. (You may need to add the directory of cmake.exe to PATH manually. Make sure that when you call cmake in PowerShell it uses the VS version of cmake.exe.)
  3. Install CUDA.
  4. Copy the four files from the CUDA MSBuildExtensions directory to the VS BuildCustomizations directory. (Everything in that directory may be useful.)
  5. Clone the repository together with its llama.cpp submodule (e.g. git clone --recurse-submodules).
  6. Activate the Python environment and run the following commands in PowerShell:
$env:CMAKE_ARGS = "-DGGML_CUDA=ON"
python -m pip install build wheel
python -m build --wheel

If you need to build with CUDA < 12.4, use MSVC 2019:

$env:CMAKE_ARGS = "-DGGML_CUDA=ON -DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29"
python -m pip install build wheel
python -m build --wheel
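For context on why the first command matters: the build backend reads CMAKE_ARGS from the environment and splits it into individual flags that are forwarded to cmake. A minimal sketch of that splitting, using only the Python standard library (the exact rules scikit-build-core applies may differ; this is illustrative):

```python
import os
import shlex

def cmake_args_from_env(env=os.environ):
    """Split the CMAKE_ARGS environment variable into a list of CMake flags.

    Approximates how build backends forward extra arguments to `cmake`.
    """
    raw = env.get("CMAKE_ARGS", "")
    return shlex.split(raw)

# Example with the MSVC 2019 flags from above. Note that the comma-separated
# toolset specification stays a single argument because it contains no spaces.
env = {"CMAKE_ARGS": "-DGGML_CUDA=ON -DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29"}
print(cmake_args_from_env(env))
# -> ['-DGGML_CUDA=ON', '-DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29']
```

This is also why the variable must be set in the same PowerShell session that runs python -m build: the flags only reach CMake through the environment.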


ParisNeo commented Feb 9, 2025

@abetlen would you please add the workflow suggested by @la1ty to automate the generation of these builds as you release new versions?


ZiyaCu commented Feb 10, 2025

+1 for pre-built whl's

@Amrabdelhamed611

@ZiyaCu @ParisNeo @la1ty, check out this repo: the textgen-webui release includes llama-cpp-python CUDA wheels.

The only downside is that these wheels can't be imported using import llama_cpp. Instead, you should use import llama_cpp_cuda or import llama_cpp_cuda_tensorcore, depending on the wheel you installed.

You can find the wheels in the requirements file:
🔗 Requirements.txt

Or check the full release here:
🔗 llama-cpp-python-cuBLAS-wheels Release
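The module-name difference mentioned above can be smoothed over with a fallback import, so the same code runs regardless of which wheel is installed. A minimal sketch (the helper name import_first is mine; the candidate module names come from the comment above):

```python
import importlib

def import_first(candidates):
    """Return the first module from `candidates` that imports successfully."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {candidates} could be imported")

# Usage sketch: prefer the plain package, then fall back to the CUDA
# wheels' module names.
# llama_cpp = import_first(
#     ["llama_cpp", "llama_cpp_cuda", "llama_cpp_cuda_tensorcore"])
```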

@ParisNeo

@Amrabdelhamed611 thanks a lot, I'll take a look.
I am using this in lollms, which should work on all kinds of systems, and it is a real pain having to write custom code for every configuration.


dw5189 commented Feb 21, 2025

PS E:\llama-cpp-python> conda activate CUDA125-py312
(CUDA125-py312) PS E:\llama-cpp-python> $env:CMAKE_ARGS = "-DGGML_CUDA=ON"
(CUDA125-py312) PS E:\llama-cpp-python> python -m pip install build wheel
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple/, http://mirrors.aliyun.com/pypi/simple/
Collecting build
Downloading http://mirrors.aliyun.com/pypi/packages/84/c2/80633736cd183ee4a62107413def345f7e6e3c01563dbca1417363cf957e/build-1.2.2.post1-py3-none-any.whl (22 kB)
Requirement already satisfied: wheel in d:\software\minipy312\envs\cuda125-py312\lib\site-packages (0.45.1)
Requirement already satisfied: packaging>=19.1 in d:\software\minipy312\envs\cuda125-py312\lib\site-packages (from build) (24.2)
Collecting pyproject_hooks (from build)
Downloading http://mirrors.aliyun.com/pypi/packages/bd/24/12818598c362d7f300f18e74db45963dbcb85150324092410c8b49405e42/pyproject_hooks-1.2.0-py3-none-any.whl (10 kB)
Collecting colorama (from build)
Downloading http://mirrors.aliyun.com/pypi/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Installing collected packages: pyproject_hooks, colorama, build
Successfully installed build-1.2.2.post1 colorama-0.4.6 pyproject_hooks-1.2.0
(CUDA125-py312) PS E:\llama-cpp-python> python -m build --wheel

  • Creating isolated environment: venv+pip...
  • Installing packages in isolated environment:
    • scikit-build-core[pyproject]>=0.9.2
  • Getting build dependencies for wheel...
  • Building wheel...
    *** scikit-build-core 0.10.7 using CMake 3.31.4 (wheel)
    *** Configuring CMake...
    2025-02-22 02:34:20,699 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
    loading initial cache file C:\Users\ADMINI~1\AppData\Local\Temp\tmpui8yd0_s\build\CMakeInit.txt
    -- Building for: Visual Studio 17 2022
    -- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.26100.
    -- The C compiler identification is MSVC 19.43.34808.0
    -- The CXX compiler identification is MSVC 19.43.34808.0
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.43.34808/bin/Hostx64/x64/cl.exe - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.43.34808/bin/Hostx64/x64/cl.exe - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.47.1.windows.2")
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
    -- Looking for pthread_create in pthreads
    -- Looking for pthread_create in pthreads - not found
    -- Looking for pthread_create in pthread
    -- Looking for pthread_create in pthread - not found
    -- Found Threads: TRUE
    -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
    -- CMAKE_SYSTEM_PROCESSOR: AMD64
    -- CMAKE_GENERATOR_PLATFORM: x64
    -- Including CPU backend
    -- Found OpenMP_C: -openmp (found version "2.0")
    -- Found OpenMP_CXX: -openmp (found version "2.0")
    -- Found OpenMP: TRUE (found version "2.0")
    -- x86 detected
    -- Performing Test HAS_AVX_1
    -- Performing Test HAS_AVX_1 - Success
    -- Performing Test HAS_AVX2_1
    -- Performing Test HAS_AVX2_1 - Success
    -- Performing Test HAS_FMA_1
    -- Performing Test HAS_FMA_1 - Success
    -- Performing Test HAS_AVX512_1
    -- Performing Test HAS_AVX512_1 - Failed
    -- Performing Test HAS_AVX512_2
    -- Performing Test HAS_AVX512_2 - Failed
    -- Adding CPU backend variant ggml-cpu: /arch:AVX2 GGML_AVX2;GGML_FMA;GGML_F16C
    -- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.5/include (found version "12.5.82")
    -- CUDA Toolkit found
    -- Using CUDA architectures: native
    CMake Error at D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCompilerId.cmake:614 (message):
    No CUDA toolset found.
    Call Stack (most recent call first):
    D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
    D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
    D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCUDACompiler.cmake:131 (CMAKE_DETERMINE_COMPILER_ID)
    vendor/llama.cpp/ggml/src/ggml-cuda/CMakeLists.txt:25 (enable_language)

-- Configuring incomplete, errors occurred!

*** CMake configuration failed

ERROR Backend subprocess exited when trying to invoke build_wheel


la1ty commented Feb 22, 2025

@dw5189 There are two possible causes I can guess at:

  1. Make sure you are using the VS version of cmake.exe to compile this project. Running cmake --version in PowerShell returns cmake version 3.29.5-msvc4 for me. (I tried the MinGW version and it failed. Your log looks normal so far, though, so good luck.)
  2. Copy the four files from the CUDA MSBuildExtensions directory to the VS BuildCustomizations directory. If you don't know how to do that, search for "No CUDA toolset found" in any web search engine; it should return plenty of pages with details.
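A quick way to check point 2 is to look for the CUDA customization files in the VS BuildCustomizations directory before retrying the build. The file names and directory below are assumptions for illustration (they vary by CUDA and VS version; check your local installation):

```python
from pathlib import Path

# Hypothetical file names for a CUDA 12.5 install; adjust to your version.
CUDA_BUILD_FILES = [
    "CUDA 12.5.props",
    "CUDA 12.5.targets",
    "CUDA 12.5.xml",
    "Nvda.Build.CudaTasks.v12.5.dll",
]

def missing_customizations(build_customizations_dir, expected=CUDA_BUILD_FILES):
    """Return the expected CUDA MSBuild files that are absent from the
    given Visual Studio BuildCustomizations directory."""
    root = Path(build_customizations_dir)
    return [name for name in expected if not (root / name).exists()]

# Usage sketch (path is hypothetical):
# missing = missing_customizations(
#     r"C:\Program Files\Microsoft Visual Studio\2022\Professional"
#     r"\MSBuild\Microsoft\VC\v170\BuildCustomizations")
# if missing:
#     print("Copy these from the CUDA MSBuildExtensions directory:", missing)
```

If the list is non-empty, CMake will typically fail with exactly the "No CUDA toolset found" error shown in the log above.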


5 participants