Translation App - Adding files to deploy Translation application on AMD GPU #1191

Merged
Commits (26):
- fe0e178 Add example for AudioQnA deploy in AMD ROCm (#1147) (artem-astafev, Nov 20, 2024)
- 0cfb71d TranslationApp - add: (Nov 26, 2024)
- 100dfbf TranslationApp - add README file (Nov 26, 2024)
- e458aac TranslationApp - fix Docker compose file and tests script (Nov 26, 2024)
- adfb587 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Nov 26, 2024)
- 8b0526b TranslationApp - fix Docker compose file and tests script (Nov 26, 2024)
- a320484 Fix DocIndexRetriever CI error on Xeon (#1167) (minmin-intel, Nov 21, 2024)
- 78d6293 Fix Translation Manifest CI with MODEL_ID (#1169) (letonghan, Nov 21, 2024)
- 122dc7c Adjustments for helm release change (#1173) (bjzhjing, Nov 21, 2024)
- b0110c7 Fix code scanning alert no. 21: Uncontrolled data used in path expres… (myqi, Nov 21, 2024)
- b621db2 Update the llm backend ports (#1172) (wangkl2, Nov 22, 2024)
- 280030c Limit the version of vllm to avoid dockers build failures. (#1183) (ZePan110, Nov 25, 2024)
- f87770c Update set_env.sh (chyundunovDatamonsters, Dec 17, 2024)
- 186e7bc Merge branch 'main' into feature/GenAIExample_TranslationApp_deploy_o… (chyundunovDatamonsters, Dec 17, 2024)
- 1a860b4 DBQnA - fix README for Translation (Dec 18, 2024)
- b1c7af5 Merge remote-tracking branch 'origin/feature/GenAIExample_Translation… (Dec 18, 2024)
- 4b7e848 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Dec 18, 2024)
- 47893e7 DBQnA - fix README for Translation (Dec 18, 2024)
- c5e1c7d Merge remote-tracking branch 'origin/feature/GenAIExample_Translation… (Dec 18, 2024)
- cb04511 Merge branch 'main' into feature/GenAIExample_TranslationApp_deploy_o… (xiguiw, Dec 20, 2024)
- 331916b Merge branch 'main' of https://github.com/opea-project/GenAIExamples … (Dec 20, 2024)
- 55faafc Merge remote-tracking branch 'origin/feature/GenAIExample_Translation… (Dec 20, 2024)
- 1391c25 Translation app - fix deploy on AMD (Dec 20, 2024)
- b8a393e Merge branch 'main' into feature/GenAIExample_TranslationApp_deploy_o… (xiguiw, Dec 23, 2024)
- 7e4e30c Merge branch 'main' into feature/GenAIExample_TranslationApp_deploy_o… (xiguiw, Dec 26, 2024)
- 8968559 Merge branch 'main' into feature/GenAIExample_TranslationApp_deploy_o… (xiguiw, Dec 31, 2024)
128 changes: 128 additions & 0 deletions Translation/docker_compose/amd/gpu/rocm/README.md
@@ -0,0 +1,128 @@
# Build and deploy Translation Application on AMD GPU (ROCm)

## Build images

### Build the LLM Docker Image

```bash
# Clone the repository
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps

# Build the Docker image
docker build -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```

### Build the MegaService Docker Image

```bash
# Clone the repository
git clone https://github.com/opea-project/GenAIExamples
cd GenAIExamples/Translation/

# Build the Docker image
docker build -t opea/translation:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```

### Build the UI Docker Image

```bash
cd GenAIExamples/Translation/ui

# Build the UI Docker image
docker build -t opea/translation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f docker/Dockerfile .
```
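
If the builds succeed, all three images should now be available locally. A quick way to confirm, matching the image names used in the build commands above:

```bash
docker images | grep -E 'opea/(llm-tgi|translation|translation-ui)'
```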

## Deploy Translation Application

### Features of the Docker Compose file for AMD GPUs

The Compose file forwards the host GPU devices into the TGI service container with the following settings:

```yaml
shm_size: 1g
devices:
  - /dev/kfd:/dev/kfd
  - /dev/dri/:/dev/dri/
cap_add:
  - SYS_PTRACE
group_add:
  - video
security_opt:
  - seccomp:unconfined
```

With this configuration, all host GPUs are passed through to the container. To assign a specific GPU to the container, list its individual device nodes cardN and renderDN instead.

For example:

```yaml
shm_size: 1g
devices:
  - /dev/kfd:/dev/kfd
  - /dev/dri/card0:/dev/dri/card0
  - /dev/dri/renderD128:/dev/dri/renderD128
cap_add:
  - SYS_PTRACE
group_add:
  - video
security_opt:
  - seccomp:unconfined
```

To determine which cardN and renderDN device nodes belong to the same GPU, use the GPU driver utilities, as shown below.
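
For example, a minimal check (assuming a standard Linux DRM setup and that `rocm-smi` from the ROCm stack is installed):

```bash
# DRI nodes that share the same pci-... prefix belong to the same GPU.
ls -l /dev/dri/by-path

# Cross-reference the PCI bus addresses with the GPUs reported by rocm-smi.
rocm-smi --showbus
```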

### Go to the directory with the Docker compose file

```bash
cd GenAIExamples/Translation/docker_compose/amd/gpu/rocm
```

### Set environment variables

Before deploying, fill in the required values in GenAIExamples/Translation/docker_compose/amd/gpu/rocm/set_env.sh: the host IP addresses and your Hugging Face Hub API token (the variables that need values are annotated with comments in the script). Then make the script executable and source it:

```bash
chmod +x set_env.sh
. set_env.sh
```
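
As an optional sanity check, confirm that the variables were exported into the current shell:

```bash
env | grep '^TRANSLATION_'
```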

### Run services

```bash
docker compose up -d
```
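
On the first start, TGI downloads the model, which can take a while. One way to watch progress (container name as defined in the Compose file below):

```bash
# All services should eventually report a status of "Up".
docker compose ps

# Follow the TGI log until the model finishes loading (Ctrl+C to stop following).
docker logs -f translation-tgi-service
```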

## Validate the MicroServices and MegaService

### Validate the TGI Service

```bash
curl http://${TRANSLATION_HOST_IP}:${TRANSLATION_TGI_SERVICE_PORT}/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-H 'Content-Type: application/json'
```
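
A healthy service returns a JSON object with a generated_text field. A scripted variant of the same check (assuming jq is installed):

```bash
# jq -e exits non-zero if .generated_text is missing or null.
curl -s http://${TRANSLATION_HOST_IP}:${TRANSLATION_TGI_SERVICE_PORT}/generate \
  -X POST \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
  -H 'Content-Type: application/json' | jq -e '.generated_text'
```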

### Validate the LLM Service

```bash
curl http://${TRANSLATION_HOST_IP}:9000/v1/chat/completions \
-X POST \
-d '{"query":"Translate this from Chinese to English:\nChinese: 我爱机器翻译。\nEnglish:"}' \
-H 'Content-Type: application/json'
```

### Validate the MegaService

```bash
curl http://${TRANSLATION_HOST_IP}:${TRANSLATION_BACKEND_SERVICE_PORT}/v1/translation -H "Content-Type: application/json" -d '{
"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
```

### Validate the Nginx Service

```bash
curl http://${TRANSLATION_HOST_IP}:${TRANSLATION_NGINX_PORT}/v1/translation \
-H "Content-Type: application/json" \
-d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
```
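
When you are finished, stop and remove the stack from the same directory:

```bash
docker compose down
```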
99 changes: 99 additions & 0 deletions Translation/docker_compose/amd/gpu/rocm/compose.yaml
@@ -0,0 +1,99 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
  translation-tgi-service:
    image: ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
    container_name: translation-tgi-service
    ports:
      - "${TRANSLATION_TGI_SERVICE_PORT:-8008}:80"
    volumes:
      - "/var/lib/GenAI/translation/data:/data"
    shm_size: 8g
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: ${TRANSLATION_TGI_LLM_ENDPOINT}
      HUGGING_FACE_HUB_TOKEN: ${TRANSLATION_HUGGINGFACEHUB_API_TOKEN}
      HUGGINGFACEHUB_API_TOKEN: ${TRANSLATION_HUGGINGFACEHUB_API_TOKEN}
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri/:/dev/dri/
    cap_add:
      - SYS_PTRACE
    group_add:
      - video
    security_opt:
      - seccomp:unconfined
    ipc: host
    command: --model-id ${TRANSLATION_LLM_MODEL_ID}
  translation-llm:
    image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
    container_name: translation-llm-tgi-server
    depends_on:
      - translation-tgi-service
    ports:
      - "9000:9000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: ${TRANSLATION_TGI_LLM_ENDPOINT}
      HUGGINGFACEHUB_API_TOKEN: ${TRANSLATION_HUGGINGFACEHUB_API_TOKEN}
      HF_HUB_DISABLE_PROGRESS_BARS: 1
      HF_HUB_ENABLE_HF_TRANSFER: 0
    restart: unless-stopped
  translation-backend-server:
    image: ${REGISTRY:-opea}/translation:${TAG:-latest}
    container_name: translation-backend-server
    depends_on:
      - translation-tgi-service
      - translation-llm
    ports:
      - "${TRANSLATION_BACKEND_SERVICE_PORT:-8888}:8888"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - MEGA_SERVICE_HOST_IP=${TRANSLATION_MEGA_SERVICE_HOST_IP}
      - LLM_SERVICE_HOST_IP=${TRANSLATION_LLM_SERVICE_HOST_IP}
    ipc: host
    restart: always
  translation-ui-server:
    image: ${REGISTRY:-opea}/translation-ui:${TAG:-latest}
    container_name: translation-ui-server
    depends_on:
      - translation-backend-server
    ports:
      - "${TRANSLATION_FRONTEND_SERVICE_PORT:-5173}:5173"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - BASE_URL=${TRANSLATION_BACKEND_SERVICE_ENDPOINT}
    ipc: host
    restart: always
  translation-nginx-server:
    image: ${REGISTRY:-opea}/nginx:${TAG:-latest}
    container_name: translation-nginx-server
    depends_on:
      - translation-backend-server
      - translation-ui-server
    ports:
      - "${TRANSLATION_NGINX_PORT:-80}:80"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - FRONTEND_SERVICE_IP=${TRANSLATION_FRONTEND_SERVICE_IP}
      - FRONTEND_SERVICE_PORT=${TRANSLATION_FRONTEND_SERVICE_PORT}
      - BACKEND_SERVICE_NAME=${TRANSLATION_BACKEND_SERVICE_NAME}
      - BACKEND_SERVICE_IP=${TRANSLATION_BACKEND_SERVICE_IP}
      - BACKEND_SERVICE_PORT=${TRANSLATION_BACKEND_SERVICE_PORT}
    ipc: host
    restart: always

networks:
  default:
    driver: bridge
21 changes: 21 additions & 0 deletions Translation/docker_compose/amd/gpu/rocm/set_env.sh
@@ -0,0 +1,21 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# IP address of the host running the services
export TRANSLATION_HOST_IP=''
# Externally reachable IP address of the host (used to build the backend endpoint for the UI)
export TRANSLATION_EXTERNAL_HOST_IP=''
export TRANSLATION_LLM_MODEL_ID="haoranxu/ALMA-13B"
# Host port published for the TGI service (Compose defaults to 8008; used by the validation commands)
export TRANSLATION_TGI_SERVICE_PORT=8008
export TRANSLATION_TGI_LLM_ENDPOINT="http://${TRANSLATION_HOST_IP}:8008"
# Hugging Face Hub API token (required to download the model)
export TRANSLATION_HUGGINGFACEHUB_API_TOKEN=''
export TRANSLATION_MEGA_SERVICE_HOST_IP=${TRANSLATION_HOST_IP}
export TRANSLATION_LLM_SERVICE_HOST_IP=${TRANSLATION_HOST_IP}
export TRANSLATION_FRONTEND_SERVICE_IP=${TRANSLATION_HOST_IP}
export TRANSLATION_FRONTEND_SERVICE_PORT=18122
export TRANSLATION_BACKEND_SERVICE_NAME=translation
export TRANSLATION_BACKEND_SERVICE_IP=${TRANSLATION_HOST_IP}
export TRANSLATION_BACKEND_SERVICE_PORT=18121
export TRANSLATION_BACKEND_SERVICE_ENDPOINT="http://${TRANSLATION_EXTERNAL_HOST_IP}:${TRANSLATION_BACKEND_SERVICE_PORT}/v1/translation"
export TRANSLATION_NGINX_PORT=18123