From 74ad157d2860ca947cd5623e3ede74fefac6880b Mon Sep 17 00:00:00 2001 From: lvliang-intel Date: Fri, 1 Mar 2024 18:08:51 +0800 Subject: [PATCH] update readme Signed-off-by: lvliang-intel --- templates/gaudi-rag/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/templates/gaudi-rag/README.md b/templates/gaudi-rag/README.md index 106e4fb3b105b..8a85c32cefd7f 100644 --- a/templates/gaudi-rag/README.md +++ b/templates/gaudi-rag/README.md @@ -1,6 +1,6 @@ # Gaudi RAG This template performs RAG using Chroma and Text Generation Inference on Habana Gaudi/Gaudi2. -The Intel Gaudi 2 accelerator supports both deep learning training and inference for AI models like LLMs. The Intel Gaudi 2 accelerator is built on a 7nm process technology. It has a heterogeneous compute architecture that includes dual matrix multiplication engines (MME) and 24 programmable tensor processor cores (TPC). When compared to popular cloud accelerators in the same generation, such as the NVIDIA A100-40GB and A100-80GB, the Gaudi 2 has more memory (96GB of HBM2E), higher memory bandwidth, and higher peak FLOP/s. Note that the AMD MI250 has higher specs per chip but comes in smaller system configurations of only 4xMI250. Gaudi2 is about twice faster than Nvidia A100 80GB for both training and inference, please check[intel-gaudi-2-benchmark](https://www.databricks.com/blog/llm-training-and-inference-intel-gaudi2-ai-accelerators) +The Intel Gaudi 2 accelerator supports both deep learning training and inference for AI models like LLMs. The Intel Gaudi 2 accelerator is built on a 7nm process technology. It has a heterogeneous compute architecture that includes dual matrix multiplication engines (MME) and 24 programmable tensor processor cores (TPC). When compared to popular cloud accelerators in the same generation, such as the NVIDIA A100-40GB and A100-80GB, the Gaudi 2 has more memory (96GB of HBM2E), higher memory bandwidth, and higher peak FLOP/s. 
Note that the AMD MI250 has higher per-chip specs but ships in smaller system configurations of only 4x MI250. Gaudi 2 is roughly twice as fast as the NVIDIA A100 80GB for both training and inference; see [intel-gaudi-2-benchmark](https://www.databricks.com/blog/llm-training-and-inference-intel-gaudi2-ai-accelerators) for details.

## Environment Setup

To use [🤗 text-generation-inference](https://github.com/huggingface/text-generation-inference) on Habana Gaudi/Gaudi2, follow these steps:
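For orientation before the detailed steps, a typical launch of the TGI server on a Gaudi host looks roughly like the sketch below. The image tag and model ID are illustrative placeholders, not this template's actual configuration; consult the tgi-gaudi documentation for the flags appropriate to your driver and firmware versions.

```shell
# Sketch only: serve a model with text-generation-inference on Gaudi.
# Image tag and model ID are assumptions -- adjust for your setup.
docker run -p 8080:80 \
  --runtime=habana \                                  # Habana container runtime
  -e HABANA_VISIBLE_DEVICES=all \                     # expose all Gaudi cards
  -e OMPI_MCA_btl_vader_single_copy_mechanism=none \  # required for multi-card MPI
  --cap-add=sys_nice \
  --ipc=host \
  ghcr.io/huggingface/tgi-gaudi:2.0.0 \
  --model-id meta-llama/Llama-2-7b-hf
```

Once the server is up, requests go to the standard TGI `/generate` endpoint on the mapped port (8080 here).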