Commit

update readme
Signed-off-by: lvliang-intel <[email protected]>
lvliang-intel committed Mar 1, 2024
1 parent bc96093 commit 74ad157
Showing 1 changed file with 1 addition and 1 deletion.
templates/gaudi-rag/README.md (1 addition, 1 deletion)
@@ -1,6 +1,6 @@
 # Gaudi RAG
 This template performs RAG using Chroma and Text Generation Inference on Habana Gaudi/Gaudi2.
-The Intel Gaudi 2 accelerator supports both deep learning training and inference for AI models like LLMs. The Intel Gaudi 2 accelerator is built on a 7nm process technology. It has a heterogeneous compute architecture that includes dual matrix multiplication engines (MME) and 24 programmable tensor processor cores (TPC). When compared to popular cloud accelerators in the same generation, such as the NVIDIA A100-40GB and A100-80GB, the Gaudi 2 has more memory (96GB of HBM2E), higher memory bandwidth, and higher peak FLOP/s. Note that the AMD MI250 has higher specs per chip but comes in smaller system configurations of only 4xMI250. Gaudi2 is about twice faster than Nvidia A100 80GB for both training and inference, please check[intel-gaudi-2-benchmark](https://www.databricks.com/blog/llm-training-and-inference-intel-gaudi2-ai-accelerators)
+The Intel Gaudi 2 accelerator supports both deep learning training and inference for AI models like LLMs. The Intel Gaudi 2 accelerator is built on a 7nm process technology. It has a heterogeneous compute architecture that includes dual matrix multiplication engines (MME) and 24 programmable tensor processor cores (TPC). When compared to popular cloud accelerators in the same generation, such as the NVIDIA A100-40GB and A100-80GB, the Gaudi 2 has more memory (96GB of HBM2E), higher memory bandwidth, and higher peak FLOP/s. Note that the AMD MI250 has higher specs per chip but comes in smaller system configurations of only 4xMI250. Gaudi2 is about twice faster than Nvidia A100 80GB for both training and inference, please check [intel-gaudi-2-benchmark](https://www.databricks.com/blog/llm-training-and-inference-intel-gaudi2-ai-accelerators).
 
 ## Environment Setup
 To use [🤗 text-generation-inference](https://github.com/huggingface/text-generation-inference) on Habana Gaudi/Gaudi2, please follow these steps:
