Setting up for Intel CPU/GPU/Gaudi

This guide provides detailed steps for setting up llm-on-ray for Intel CPU, Intel Data Center GPU, or Intel Gaudi2.

Hardware and Software Requirements

Hardware Requirements

Ensure your setup includes one of the following Intel hardware platforms:

  • Intel CPU
  • Intel Data Center GPU
  • Intel Gaudi2

Software Requirements

  • Git
  • Conda
  • Docker

Setup

1. Prerequisites

For Intel GPU, ensure the Intel® oneAPI Base Toolkit is installed.
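
As a quick sanity check (a sketch; /opt/intel/oneapi is the default install prefix and may differ on your system), source the oneAPI environment and list the visible devices:

# Load the oneAPI environment (default install prefix assumed)
source /opt/intel/oneapi/setvars.sh
# List SYCL devices; your Intel GPU should appear in the output
sycl-ls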

For Gaudi, ensure the SynapseAI SW stack and container runtime are installed.
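
On a bare metal host with the driver installed, the hl-smi utility that ships with the Habana driver can confirm the devices are visible (a quick sanity check):

# Lists Gaudi devices, driver version, and utilization
hl-smi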

2. Clone the repository and install dependencies.

git clone https://github.com/intel/llm-on-ray.git
cd llm-on-ray
conda create -n llm-on-ray python=3.9
conda activate llm-on-ray

For CPU:

pip install .[cpu] --extra-index-url https://download.pytorch.org/whl/cpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/

For GPU:

pip install .[gpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
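
To verify the installation (a hedged check; intel_extension_for_pytorch is the package pulled in by the extras above), confirm that PyTorch and the Intel extension import cleanly:

# Verify PyTorch and Intel Extension for PyTorch are installed
python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__, ipex.__version__)"
# On Intel GPU, additionally confirm the XPU device is visible
python -c "import torch, intel_extension_for_pytorch; print(torch.xpu.is_available())"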

If DeepSpeed is enabled or you are doing distributed finetuning, the oneCCL and Intel MPI libraries should be dynamically linked on every node before Ray starts:

source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl; print(torch_ccl.cwd)")/env/setvars.sh
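
Sourcing this script only affects the current shell, so it must be repeated on every node before ray start. One way to make it persistent (a sketch, assuming a bash login shell) is to append it to ~/.bashrc:

# Persist the oneCCL/Intel MPI environment for future bash sessions
echo 'source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl; print(torch_ccl.cwd)")/env/setvars.sh' >> ~/.bashrc
# After sourcing, Intel MPI should be on the PATH
mpirun --version
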
For Gaudi:

Please use the Dockerfile to build the image. Alternatively, you can install the dependencies on a bare metal machine; in that case, please refer to the instructions here.

# Build from the dev/docker/ directory, using the repository root as the build context
cd ./dev/docker
docker build \
    -f Dockerfile.habana ../../ \
    -t llm-on-ray:habana \
    --network=host \
    --build-arg http_proxy=${http_proxy} \
    --build-arg https_proxy=${https_proxy} \
    --build-arg no_proxy=${no_proxy}
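
Once the build completes, you can confirm the image is available locally:

# The llm-on-ray:habana image should appear in the output
docker images llm-on-ray:habana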

After the image is built successfully, start a container:

# Mounting the llm-on-ray repository into the container is necessary.
# Please replace /path/to/llm-on-ray with your actual path to llm-on-ray.
# Add -p HOST_PORT:8080 or --net host if using UI.
# Add --cap-add sys_ptrace to enable py-spy in container if you need to debug.
# Set HABANA_VISIBLE_DEVICES if multi-tenancy is needed, such as "-e HABANA_VISIBLE_DEVICES=0,1,2,3"
# For multi-tenancy, refer to https://docs.habana.ai/en/latest/PyTorch/Reference/PT_Multiple_Tenants_on_HPU/Multiple_Dockers_each_with_Single_Workload.html
docker run -it --runtime=habana --name="llm-ray-habana-demo" -v /path/to/llm-on-ray:/root/llm-on-ray -v /path/to/models:/models/in/container -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host llm-on-ray:habana 
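
If you need another shell in the running container later (for example, for debugging), you can attach to it by the name used above:

# Open an additional shell in the named container
docker exec -it llm-ray-habana-demo bash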

3. Launch Ray cluster

Start the Ray head node using the following command.

ray start --head --node-ip-address 127.0.0.1 --dashboard-host='0.0.0.0' --dashboard-port=8265
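
Once the head node is running, you can check the cluster state with Ray's built-in status command:

# Shows the nodes and available resources in the running cluster
ray status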

Optionally, for a multi-node cluster, start Ray worker nodes on each additional machine (replace 127.0.0.1 with the head node's IP address when the workers run on other hosts):

ray start --address='127.0.0.1:6379'
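
As a final check (a minimal sketch using Ray's public Python API), connect to the running cluster and print its aggregate resources:

# Connect to the cluster started above and print its resources
python -c "import ray; ray.init(address='auto'); print(ray.cluster_resources())"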