Aiven-Labs/preparing-data-for-opensearch-and-rag

Prepare your Data for AI Using Aiven for OpenSearch and LangChain

This workshop takes unprepared data and makes it usable with a Retrieval-Augmented Generation (RAG) pattern for a chatbot.

Learn more about regularly occurring Aiven workshops

Click to Get Started

In this workshop, we'll be using Aiven for OpenSearch and LangChain to:

  • Chunk transcription data and generate embeddings
  • Configure our OpenSearch index for k-Nearest Neighbors (k-NN) and perform a similarity search
  • Connect our search responses to a Large Language Model (LLM) to generate informed answers using LangChain
  • Compare the performance of multiple LLMs
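
As a rough illustration of the first two steps, here is a minimal plain-Python sketch. It is not the workshop's notebook code: the workshop uses LangChain for splitting and embedding, while this sketch uses a hand-written character-window chunker. The field names `text` and `embedding`, the chunk sizes, and the helper names are all assumptions chosen for illustration. The request bodies follow the general shape of OpenSearch k-NN index settings and queries, which you would pass to your OpenSearch client.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (hypothetical helper)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def knn_index_body(dimension):
    """Index body enabling k-NN with one vector field.

    The field names "text" and "embedding" are assumptions; the dimension
    must match the output size of whichever embedding model you use.
    """
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def knn_query_body(vector, k=3):
    """Search body asking for the k stored vectors nearest to `vector`."""
    return {
        "size": k,
        "query": {"knn": {"embedding": {"vector": vector, "k": k}}},
    }


if __name__ == "__main__":
    transcript = "abcdefghij"
    print(chunk_text(transcript, chunk_size=4, overlap=1))
    print(knn_index_body(dimension=8))
    print(knn_query_body([0.1] * 8, k=2))
```

The overlap between neighboring chunks is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk. Suitable chunk sizes depend on the transcript and the embedding model.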

Getting Started

Our instructions and notebooks are in the workshop folder.

License

The text and materials for this workshop are licensed under the Apache License, Version 2.0. The full license text is available in the LICENSE file.

Please note that the project explicitly does not require a CLA (Contributor License Agreement) from its contributors.

The Conduit Podcast Transcripts by Jay Miller and Kathy Campbell, with original downloads from Whisper work done by Pilix, are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

Contact

Bug reports and patches are very welcome. Please post them as GitHub issues and pull requests at https://github.com/Aiven-Labs/preparing-data-for-opensearch-and-rag

To report any possible vulnerabilities or other serious issues, please see our security policy.

Report Code of Conduct issues according to our policy.