R2R

R2R (RAG to Riches) is a semi-opinionated Python framework for rapidly building and deploying production-ready Retrieval-Augmented Generation (RAG) systems. It is designed to shorten the path from an experimental prototype to a production-grade deployment.

Developer Installation

Fast Install:

Install R2R directly using pip:
```
pip install r2r
```

Full Install:

Follow these steps to ensure a smooth setup:

Install Poetry:
- Before installing the project, make sure you have Poetry on your system. If not, visit the official Poetry website for installation instructions.
Clone and Install Dependencies:
- Clone the project repository and navigate to the project directory:
```
git clone git@github.com:SciPhi-AI/r2r.git
cd r2r
```
- Install the project dependencies with Poetry:
```
poetry install
```
Configure Environment Variables:
- You need to set up cloud provider secrets in your .env file for the project to work properly. At a minimum, you will need an OpenAI key and a vector database provider.
- For a fast setup, we recommend creating a project on Supabase, enabling the vector extension, and then updating the .env.example with the necessary details.
- Other providers are also available, such as Qdrant for vector database support.
- Once updated, copy the .env.example to .env to apply your configurations:
```
cp .env.example .env
```
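Before starting the application, it can help to confirm that the copied .env file actually contains the secrets you need. The snippet below is a minimal sketch of such a check, not part of R2R. The helper names `parse_env` and `missing_keys` are hypothetical, and `OPENAI_API_KEY` is an assumed variable name; use whatever names appear in your .env.example.

```python
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str], required: List[str]) -> List[str]:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not values.get(key)]


# Assumed key name for illustration; check .env.example for the real ones.
REQUIRED = ["OPENAI_API_KEY"]

sample = "# provider secrets\nOPENAI_API_KEY=\n"
print(missing_keys(parse_env(sample), REQUIRED))
```

To check a real file, read it with `open(".env").read()` and pass the text to `parse_env`. Extend `REQUIRED` with the keys for your chosen vector database provider.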

Demonstration

[Demo video: qt_exp_1_720p.mp4]

Community

Join our Discord server!

Core Abstractions

The framework primarily revolves around three core abstractions:

- The Ingestion Pipeline: Prepares embeddable 'Documents' from various data formats (json, txt, pdf, html, etc.). The abstraction can be found in ingestion.py.
- The Embedding Pipeline: Transforms text into stored vector embeddings, interacting with embedding and vector database providers through a series of steps (e.g., extract_text, transform_text, chunk_text, embed_chunks). The abstraction can be found in embedding.py.
- The RAG Pipeline: Works similarly to the embedding pipeline but incorporates an LLM provider to produce text completions. The abstraction can be found in rag.py.

Each pipeline incorporates a logging database for operation tracking and observability.
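To make the step-based design concrete, here is a minimal sketch of how an embedding pipeline built from the steps named above might be structured. It is an illustration only and does not reproduce the classes in embedding.py: the class names `EmbeddingPipelineSketch` and `ToyPipeline`, the `run` method, and the method signatures are all assumptions, and the "embedding" is a toy function rather than a call to a real provider. Storing the vectors in a database is omitted.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class EmbeddingPipelineSketch(ABC):
    """Hypothetical base class; step names come from the README, signatures are assumed."""

    @abstractmethod
    def extract_text(self, document: Dict[str, str]) -> str: ...

    @abstractmethod
    def transform_text(self, text: str) -> str: ...

    @abstractmethod
    def chunk_text(self, text: str) -> List[str]: ...

    @abstractmethod
    def embed_chunks(self, chunks: List[str]) -> List[List[float]]: ...

    def run(self, document: Dict[str, str]) -> List[List[float]]:
        # Chain the steps in order; each step can be overridden independently.
        text = self.extract_text(document)
        text = self.transform_text(text)
        chunks = self.chunk_text(text)
        return self.embed_chunks(chunks)


class ToyPipeline(EmbeddingPipelineSketch):
    """Toy implementation with no external providers."""

    def __init__(self, chunk_size: int = 20) -> None:
        self.chunk_size = chunk_size

    def extract_text(self, document: Dict[str, str]) -> str:
        return document["text"]

    def transform_text(self, text: str) -> str:
        # Collapse runs of whitespace into single spaces.
        return " ".join(text.split())

    def chunk_text(self, text: str) -> List[str]:
        # Fixed-size character chunks; real pipelines usually split on tokens.
        return [text[i:i + self.chunk_size] for i in range(0, len(text), self.chunk_size)]

    def embed_chunks(self, chunks: List[str]) -> List[List[float]]:
        # Stand-in "embedding": [chunk length, count of lowercase vowels].
        return [[float(len(c)), float(sum(ch in "aeiou" for ch in c))] for c in chunks]


vectors = ToyPipeline(chunk_size=10).run({"text": "RAG  to   Riches"})
print(vectors)
```

Keeping each step as a separate overridable method is one way to swap in a different chunking strategy or embedding provider without touching the rest of the pipeline.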

Running the Examples

The project includes several basic examples that demonstrate application deployment and standalone usage of the embedding and RAG pipelines:

app.py: This example runs the main application, which includes the ingestion, embedding, and RAG pipelines served via FastAPI.
```
poetry run uvicorn examples.basic.app:app
```
test_client.py: Run this example after starting the main application. It tests the user client against the running application.
```
poetry run python -m examples.client.test_client
```
rag_pipeline.py: This standalone example demonstrates the usage of the RAG pipeline. It takes a query as input and returns a completion generated by the OpenAI API.
```
poetry run python -m examples.basic.rag_pipeline
```
embedding_pipeline.py: This standalone example demonstrates the usage of the embedding pipeline. It loads datasets from HuggingFace, generates embeddings for the data using the OpenAI API, and stores the embeddings in a PostgreSQL vector database.
```
poetry run python -m examples.basic.embedding_pipeline
```
