This guide provides detailed steps for setting up llm-on-ray on Intel CPU, Intel Data Center GPU, or Intel Gaudi2.
Ensure your setup includes one of the following Intel hardware:
- CPU: Intel® 1st, 2nd, 3rd, and 4th Gen Xeon® Scalable Performance processors
- GPU:

  | Product Name | Launch Date | Memory Size | Xe-cores |
  |---|---|---|---|
  | Intel® Data Center GPU Max 1550 | Q1'23 | 128 GB | 128 |
  | Intel® Data Center GPU Max 1100 | Q2'23 | 48 GB | 56 |

- Gaudi: Gaudi2
The following software is also required:
- Git
- Conda
- Docker
For Intel GPU, ensure the Intel® oneAPI Base Toolkit is installed.
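If the toolkit is installed in its default location, its environment can typically be activated with the command below (the path is the default install prefix and may differ on your system):
source /opt/intel/oneapi/setvars.sh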
For Gaudi, ensure the SynapseAI SW stack and container runtime are installed.
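A quick way to confirm the Gaudi driver stack is visible on the host is Habana's device query tool (assuming it was installed with the SynapseAI stack):
hl-smi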
git clone https://github.com/intel/llm-on-ray.git
cd llm-on-ray
conda create -n llm-on-ray python=3.9
conda activate llm-on-ray
# For Intel CPU:
pip install .[cpu] --extra-index-url https://download.pytorch.org/whl/cpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
# For Intel GPU:
pip install .[gpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
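As a sanity check, verify that the packages the extras are expected to pull in can be imported; this assumes the CPU/GPU extras install Intel® Extension for PyTorch alongside PyTorch:
python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__, ipex.__version__)"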
If DeepSpeed is enabled or you are doing distributed fine-tuning, the oneCCL and Intel MPI libraries should be dynamically linked on every node before Ray starts:
source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl; print(torch_ccl.cwd)")/env/setvars.sh
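Since the purpose of this script is dynamic linking, a rough sanity check after sourcing it is that the oneCCL library directory now appears on LD_LIBRARY_PATH (this assumes the script prepends it there):
echo $LD_LIBRARY_PATH | tr ':' '\n' | grep -i ccl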
Please use the Dockerfile to build the image. Alternatively, you can install the dependencies on a bare metal machine. In this case, please refer to here.
# Under dev/docker/
cd ./dev/docker
docker build \
-f Dockerfile.habana ../../ \
-t llm-on-ray:habana \
--network=host \
--build-arg http_proxy=${http_proxy} \
--build-arg https_proxy=${https_proxy} \
--build-arg no_proxy=${no_proxy}
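If the build succeeds, the image should show up locally under the repository and tag used above:
docker images llm-on-ray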
After the image is built successfully, start a container:
# llm-on-ray mounting is necessary.
# Please replace /path/to/llm-on-ray with your actual path to llm-on-ray.
# Add -p HOST_PORT:8080 or --net host if using UI.
# Add --cap-add sys_ptrace to enable py-spy in container if you need to debug.
# Set HABANA_VISIBLE_DEVICES if multi-tenancy is needed, such as "-e HABANA_VISIBLE_DEVICES=0,1,2,3"
# For multi-tenancy, refer to https://docs.habana.ai/en/latest/PyTorch/Reference/PT_Multiple_Tenants_on_HPU/Multiple_Dockers_each_with_Single_Workload.html
docker run -it --runtime=habana --name="llm-ray-habana-demo" -v /path/to/llm-on-ray:/root/llm-on-ray -v /path/to/models:/models/in/container -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host llm-on-ray:habana
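From another terminal, you can check that the habana runtime mapped the Gaudi devices into the container, assuming hl-smi is available inside the image:
docker exec -it llm-ray-habana-demo hl-smi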
Start the Ray head node using the following command.
ray start --head --node-ip-address 127.0.0.1 --dashboard-host='0.0.0.0' --dashboard-port=8265
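With the flags above, the Ray dashboard should be reachable at http://127.0.0.1:8265, and the cluster state can be inspected from the same machine:
ray status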
Optionally, for a multi-node cluster, start Ray worker nodes:
ray start --address='127.0.0.1:6379'
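The 127.0.0.1 address assumes the head and worker run on the same machine; on a real multi-node cluster, point workers at the head node's reachable IP instead (the address below is hypothetical):
ray start --address='10.0.0.1:6379'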