Releases: opea-project/GenAIComps
Releases · opea-project/GenAIComps
Generative AI Components v1.0 Release Notes
OPEA Release Notes v1.0
What’s New in OPEA v1.0
-
Highlights
- Improve the RAG performance through microservice optimizations (e.g., Hugging Face TGI, vLLM) and megaservice tuning
- Provide the experimental LLM model training support, includes full fine-tuning and parameter-efficient fine-tuning (PEFT)
- Improve RAG with Knowledge Graph based on Neo4j
- Improve VisualQnA and provide multi-modality RAG support
- Faster microservice launch through removal of some dispatch overhead
- Enable Gateway with guardrail, and integrate nginx with CORS protection and data preparation
- Enable HorizontalPodAutoscaler (HPA) for better resource management
- Define the metrics of RAG performance and enable accuracy evaluation for more GenAI examples
- Further improvement on documentation and developer experience
-
Other features
- Enable OpenAI compatible format on applicable microservices
- Support microservice launch from ModelScope to address China ecosystem need
- Support Red Hat OpenShift Container Platform (RHOCP)
- Refactor the code and CI/CD pipeline to provide better support for contributors
- Improve Docker versioning to avoid the potential conflict
- Enhance GenAI Microservice Connector (GMC), including improvements such as router performance optimizations and other updates
- Introduce Memory Bandwidth Exporter that integrates with Kubernetes Node Resource Interface
-
Learn more about OPEA at
- Getting Started: https://opea-project.github.io/latest/index.html
- Github: https://github.com/opea-project
- Docker Hub: https://hub.docker.com/u/opea
-
Release Documentation:
- Landing Page: https://opea.dev/
- Release Notes: https://github.com/opea-project/docs/tree/main/release_notes
Details
GenAIExamples
-
Deployment
- Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
- K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
- Update mount path in xeon k8s(2a6af64)
- Add Nginx - k8s manifest in CodeTrans(6a679ba)
- Add Nginx - docker in CodeTrans(cc84847)
- watch more docker compose files changes(4b0bc26)
- Add chatQnA UI manifest(758d236)
- Revert the LLM model for kubernetes GMS(f5f1e32)
- [ChatQnA] Update retrieval & dataprep manifests(6730b24)
- [ChatQnA]Update manifests(3563f5d)
- [ChatQnA] Update benchmarking manifests(36fb9a9)
- [ChatQnA] udate OOB & Tuned manifests(ac34860)
- Add nginx and UI to the ChatQnA manifest(05f9828)
- [ChatQnA] Update OOB with wrapper manifests.(933c3d3)
- [Translation] Support manifests and nginx(1e13031)
- update V1.0 benchmark manifest (e5affb9)
- update image name(e2a74f7)
- K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
- Change megaservice path in line with new file structure(5ab27b6)
- Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
- Add chatQnA UI manifest(758d236)
- Yaml: add comments to specify gaudi device ids.(63406dc)
- add tgi bf16 setup on CPU k8s.(ba17031)
-
Documentation
- [ChatQnA] Update README for ModelScope(aebc23f)
- Update README.md(4bd7841)
- [ChatQnA] Update README for without Rerank Pipeline(6b617d6)
- [ChatQnA] Update Benchmark README for w/o rerank(4a51874)
- Fix readme for nv gpu(43b2ae5)
- [ChatQnA] Update Benchmark README to Fix Input Length(55d287d)
- Refine ChatQnA README for TGI(afc3341)
- Add default model for VisualQnA README(07baa8f)
- Update readme for manifests of some examples(adb157f)
- doc: use markdown table in supported_examples(9cf1d88)
- doc: remove invalid code block language(c6d811a)
- add AudioQnA readme with supported model(f4f4da2)
- add more code owners(7f89797)
- doc: fix headings(7a0fca7)
- [Codegen] Refine readme to prompt users on how to change the model.(814164d)
- Update README.md and remove some open-source details(2ef83fc)
- Add issue template(84a781a)
- doc: fix headings and indenting(67394b8)
- Add default model in readme for FaqGen and DocSum(d487093)
- Change docs of kubernetes for curl commands in README(4133757)
- Update v0.9 RAG release data(947936e)
- Explain Default Model in ChatQnA and CodeTrans READMEs(2a2ff45)
- Update docker images list.(a8244c4)
- refactor the network port setting for AWS(bc81770)
- Add validate microservice details link(bd811bd)
- [ChatQnA] Add Nginx in Docker Compose and README(6c36448
- [Doc] Update CodeGen and Translation READMEs(a09395e)
- [Doc] Refine READMEs(372d78c)
- Remove marketing materials(d85ec09)
- doc PR to main instead of of v1.0r(dc94026)
- Update README.md for Multiplatforms(b205dc7)
- Refine the quick start of ChatQnA(3b70fb0)
- Update supported_examples(96d5cd9)
- [Doc] doc improvement(e0b3b57)
- Fix README issues(bceacdc)
- doc: fix broken image reference and markdown(d422929)
- doc: give document meaningful title(a3fa0d6)
- doc: fix incorrefine readme for reorg(d2bab99)
- doc: fix incorrect path to png image files (d97882e)
- update doc according to comments(f990f79)
- doc: fix headings and indenting(67394b8)
- Update README.md(4bd7841)
- refine readme for reorg(d2bab99)
- Update README with new examples(2d28beb)
- README: fix broken links(ff6f841)
- Update v0.9 RAG release data([947936e](https://github....
Generative AI Components v0.9 Release Notes
OPEA Release Notes v0.9
What’s New in OPEA v0.9
-
Broaden functionality
- Provide telemetry functionalities for metrics and tracing using Prometheus, Grafana, and Jaeger
- Initialize two Agent examples: AgentQnA and DocIndexRetriever
- Support for authentication and authorization
- Add Nginx Component to strengthen backend security
- Provide Toxicity Detection Microservice
- Support the experimental Fine-tuning microservice
-
Enhancement
- Align the Microservice format with the standards of OpenAI (Chat Completions, Fine-tuning... etc)
- Enhance the performance benchmarking and evaluation for GenAI Examples, ex: TGI, resource allocation, ...etc
- Enable support for launching container images as a non-root user
- Use Llama-Guard-2-8B as default Guardrails model and bge-large-zh-v1.5 as default embedding model, mistral-7b-grok as default CodeTrans model
- Add ProductivitySuite to provide access management and maintains user context
-
Deployment
- Support Red Hat OpenShift Container Platform (RHOCP)
- GenAI Microservices Connector (GMC) successfully tested on Nvidia GPUs
- Add Kubernetes support for AudioQnA and VisualQnA examples
-
OPEA Docker Hub: https://hub.docker.com/u/opea
-
Thanks for the external contribution from Sharan Shirodkar, Aishwarya Ramasethu
, Michal Nicpon and Jacob Mansdorfer
Details
GenAIExamples
-
ChatQnA
- Update port in set_env.sh(040d2b7)
- Fix minor issue in ChatQnA Gaudi docker README(a5ed223)
- update chatqna dataprep-redis port(02a1536)
- Add support for .md file in file upload in the chatqna-ui(7a67298)
- Added the ChatQnA delete feature, and updated the corresponding README(09a3196)
- fixed ISSUE-528(45cf553)
- Fix vLLM and vLLM-on-Ray UT bug(cfcac3f)
- set OLLAMA_MODEL env to docker container(c297155)
- Update guardrail docker file path(06c4484)
- remove ray serve(c71bc68)
- Refine docker_compose for dataprep param settings(3913c7b)
- fix chatqna guardrails(db2d2bd)
- Support ChatQnA pipeline without rerank microservice(a54ffd2)
- Update the number of microservice replicas for OPEA v0.9(e6b4fff)
- Update set_env.sh(9657f7b)
- add env for chatqna vllm(f78aa9e)
-
Deployment
- update manifests for v0.9(ba78b4c)
- Update K8S manifest for ChatQnA/CodeGen/CodeTrans/DocSum(01c1b75)
- Update benchmark manifest to fix errors(4fd3517)
- Update env for manifest(4fa37e7)
- update manifests for v0.9(08f57fa)
- Add AudioQnA example via GMC(c86cf85)
- add k8s support for audioqna(0a6bad0)
- Update mainifest for FaqGen(80e3e2a)
- Add kubernetes support for VisualQnA(4f7fc39)
- Add dataprep microservice to chatQnA example and the e2e test(1c23d87)
-
Documentation
- [doc] Update README.md(c73e4e0)
- doc fix: Update README.md to remove specific dicscription of paragraph-1(5a9c109)
- doc: fix markdown in docker_image_list.md(9277fe6)
- doc: fix markdown in Translation/README.md(d645305)
- doc: fix markdown in SearchQnA/README.md(c461b60)
- doc: fix FaqGen/README.md markdown(704ec92)
- doc: fix markdown in DocSum/README.md(83712b9)
- doc: fix markdown in CodeTrans/README.md(076bca3)
- doc: fix CodeGen/README.md markdown(33f8329)
- doc: fix markdown in ChatQnA/README.md(015a2b1)
- doc: fix headings in markdown files(21fab71)
- doc: missed an H1 in the middle of a doc(4259240)
- doc: remove use of HTML for table in README(e81e0e5)
- Update ChatQnA readme with OpenShift instructions(ed48371)
- Convert HTML to markdown format.(14621f8)
- Fix typo {your_ip} to {host_ip}(ad8ca88)
- README fix typo(abc02e1)
- fix script issues in MD file(acdd712)
- Minor documentation improvements in the CodeGen README(17b9676)
- Refine Main README(08eb269)
- [Doc]Add a micro/mega service WorkFlow for DocSum(343d614)
- Update README for k8s deployment(fbb81b6)
-
Other examples
- Clean deprecated VisualQnA code(87617e7)
- Using TGI official release docker image for intel cpu(b2771ad)
- Add VisualQnA UI(923cf69)
- fix container name(5ac77f7)
- Add VisualQnA docker for both Gaudi and Xeon using TGI serving(2390920)
- Remove LangSmith from Examples(88eeb0d)
- Modify the language variable to match language highlight.(f08d411)
- Remove deprecated folder.(7dd9952)
- update env for manifest(4fa37e7)
- AgentQnA example(67df280)
- fix tgi xeon tag(6674832)
- Add new DocIndexRetriever example(566cf93)
- Add env params for chatqna xeon test(5d3950)
- ProductivitySuite Combo Application with REACT UI and Keycloak Authen(947cbe3)
- change codegen tgi model(06cb308)
- change searchqna prompt(acbaaf8)
- minor fix mismatched hf token(ac324a9)
- fix translation gaudi env(4f3be23)
- Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml (c25063f)
-
CI/CD/UT
Generative AI Components v0.8 Release Notes
OPEA Release Notes v0.8
What’s New in OPEA v0.8
-
Broaden functionality
- Support frequently asked questions (FAQs) generation GenAI example
- Expand the support of LLMs such as Llama3.1 and Qwen2 and support LVMs such as llava
- Enable end-to-end performance and accuracy benchmarking
- Support the experimental Agent microservice
- Support LLM serving on Ray
-
Multi-platform support
- Release the Docker images of GenAI components under OPEA dockerhub and support the deployment with Docker
- Support cloud-native deployment through Kubernetes manifests and GenAI Microservices Connector (GMC)
- Enable the experimental authentication and authorization support using JWT tokens
- Validate ChatQnA on multiple platforms such as Xeon, Gaudi, AIPC, Nvidia, and AWS
-
OPEA Docker Hub: https://hub.docker.com/u/opea
Details
GenAIExamples
-
ChatQnA
- Add ChatQnA instructions for AIPC(26d4ff)
- Adapt Vllm response format (034541)
- Update tgi version(5f52a1)
- Update README.md(f9312b)
- Udpate ChatQnA docker compose for Dataprep Update(335362)
- [Doc] Add valid micro-service details(e878dc)
- Updates for running ChatQnA + Conversational UI on Gaudi(89ddec)
- Fix win PC issues(ba6541)
- [Doc]Add ChatQnA Flow Chart(97da49)
- Add guardrails in the ChatQnA pipeline(955159)
- Fix a minor bug for chatqna in docker-compose(b46ae8)
- Support vLLM/vLLM-on-Ray/Ray Serve for ChatQnA(631d84)
- Added ChatQnA example using Qdrant retriever(c74564)
- Update TEI version v1.5 for better performance(f4b4ac)
- Update ChatQnA upload feature(598484)
- Add auto truncate for embedding and rerank(8b6094)
-
Deployment
- Add Kubernetes manifest files for deploying DocSum(831463)
- Update Kubernetes manifest files for CodeGen(2f9397)
- Add Kubernetes manifest files for deploying CodeTrans(c9548d)
- Updated READMEs for kubernetes example pipelines(c37d9c)
- Update all examples yaml files of GMC in GenAIExample(290a74)
- Doc: fix minor issue in GMC doc(d99461)
- README for installing 4 worklods using helm chart(6e797f)
- Update Kubernetes manifest files for deploying ChatQnA(665c46)
- Add new example of SearchQnA for GenAIExample(21b7d1)
- Add new example of Translation for GenAIExample(d0b028)
-
Other examples
- Update reranking microservice dockerfile path (d7a5b7)
- Update tgi-gaudi version(3505bd)
- Refine README of Examples(f73267)
- Update READMEs(8ad7f3)
- [CodeGen] Add codegen flowchart(377dd2)
- Update audioqna image name(615f0d)
- Add auto-truncate to gaudi tei (8d4209)
- Update visualQnA chinese version(497895)
- Fix Typo for Translation Example(95c13d)
- FAQGen Megaservice(8c4a25)
- Code-gen-react-ui(1b48e5)
- Added doc sum react-ui(edf0d1)
-
CI/UT
- Frontend failed with unknown timeout issue (7ebe78)
- Adding Chatqna Benchmark Test(11a56e)
- Expand tgi connect timeout(ee0dcb)
- Optimize gmc manifest e2e tests(15fc6f)
- Add docker compose yaml print for test(bb4230)
- Refactor translation ci test (b7975e)
- Refactor searchqna ci test(ecf333)
- Translate UT for UI(284d85)
- Enhancement the codetrans e2e test(450efc)
- Allow gmc e2e workflow to get secrets(f45f50)
- Add checkout ref in gmc e2e workflow(62ae64)
- SearchQnA UT(268d58)
GenAIComps
-
Cores
-
LLM
- Optional vllm microservice container build(963755)
- Refine vllm instruction(6e2c28)
- Introduce 'entrypoint.sh' for some Containers(9ecc5c)
- Support llamaindex for retrieval microservice and remove langchain(61795f)
- Update tgi with text-generation-inference:2.1.0(f23694)
- Fix requirements(f4b029)
- Add vLLM on Ray microservice(ec3b2e)
- Update code/readme/UT for Ray Serve and VLLM([dd939c](https://gith...
Generative AI Components v0.7 Release Notes
GenAIComps
-
Cores
-
LLM
- Support Qwen2 in LLM Microservice(3f5cde)
- Fix the vLLM docker compose issues(3d134d)
- Enable vLLM Gaudi support for LLM service based on officially habana vllm release(0dedc2)
- Openvino support in vllm(7dbad0)
- Support Ollama microservice(a00e36)
- Support vLLM XFT LLM microservice(2a6a29, 309c2d, fe5f39)
- Add e2e test for llm summarization tgi(e8ebd9)
-
DataPrep
- Support Dataprep(f7443f), embedding(f37ce2) microservice with Llama Index
- Fix dataprep microservice path issue(e20acc)
- Add milvus microservice(e85033)
- Add Ray version for multi file process(40c1aa)
- Fix dataprep timeout issue(61ead4)
- Add e2e test for dataprep redis langchain(6b7bec)
- Supported image summarization with LVM in dataprep microservice(86412c)
- Enable conditional splitting for html files(e1dad1)
- Added support for pyspark in dataprep microservice(a5eb14)
- DataPrep extract info from table in the docs(953e78)
- Added support for extracting info from image in the docs(e23745)
-
Other Components
- Add PGvector support in Vectorstores(1b7001) and Retriever(75eff6), Dataprep(9de3c7)
- Add Mosec embedding(f76685) and reranking(a58ca4)
- Add knowledge graph components(4c0afd)
- Add LVMs LLaVA component(bd385b)
- Add asr/tts components for xeon and hpu(cef6ea)
- Add WebSearch Retriever Microservice(900178)
- Add initial pii detection microservice(e38041)
- Pinecone support for dataprep and retrieval microservice(8b6486)
- Support prometheus metrics for opea microservices(758914), (900178)
- Add no_proxy env for micro services(df0c11)
- Enable RAGAS(8a670e)
- Fix RAG performance issues(70c23d)
- Support rerank and retrieval of RAG OPT(b51675)
- Reranking using an optimized bi-encoder(574847)
- Use parameter for retriever(358dbd), reranker(dfdd08)
-
CI
Others
Generative AI Components v0.6 Release Notes
GenAIComps
- Activate a suite of microservices including ASR, LLMS, Rerank, Embedding, Guardrails, TTS, Telemetry, DataPrep, Retrieval, and VectorDB.
- ASR functionality is fully operational on Xeon architecture, pending readiness on Gaudi.
- Retrieval capabilities are functional on LangChain, awaiting readiness on LlamaIndex.
- VectorDB functionality is supported on Redis, Chroma, and Qdrant, with readiness pending on SVS.
- Added 14 file formats support in data preparation microservices and enabled Safeguard of conversation in guardrails.
- Added the Ray Gaudi Supported for LLM Service.