Commit
1 parent
1e5db4d
commit 3857988
Showing
19 changed files
with
246 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
The backend provides a REST API that abstracts the RAG functionalities. The core ships with just enough to query your indexed documents. | ||
|
||
More advanced features (authentication, user sessions, ...) can be enabled through [plugins](plugins/plugins.md). | ||
|
||
### Architecture | ||
|
||
![](backend.png) | ||
|
||
Start the backend server locally: | ||
```shell | ||
python -m uvicorn backend.main:app | ||
``` | ||
> INFO: Application startup complete. | ||
> INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit) | ||
|
||
### Base RAG | ||
|
||
The base RAG-as-a-service API is defined at `backend/main.py`: | ||
```python | ||
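from pathlib import Path | ||
from fastapi import FastAPI | ||
from langserve import add_routes | ||
from backend.rag_components.rag import RAG | ||
|
||
rag = RAG(config=Path(__file__).parent / "config.yaml") | ||
chain = rag.get_chain() | ||
|
||
app = FastAPI( | ||
    title="RAG Accelerator", | ||
    description="A RAG-based question answering API", | ||
) | ||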
|
||
add_routes(app, chain) | ||
``` | ||
The core RAG allows you to load documents and ask questions about them. `add_routes` comes straight from Langserve and sets up the basic API routes for chain serving. Our plugins will be added similarly. | ||
|
||
The API documentation (http://0.0.0.0:8000/docs if serving locally) lists these routes. You can query your RAG directly from there using the `/invoke` endpoint. | ||
|
||
![base_api.png](base_api.png) | ||
|
||
![base_invoke.png](base_invoke.png) | ||
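
The `/invoke` route can also be called from a script. Below is a minimal sketch using only the standard library. It assumes the server runs locally on `http://127.0.0.1:8000` (as in the Uvicorn log above), that the chain accepts a plain question string, and that the request and response follow Langserve's usual `{"input": ...}` / `{"output": ...}` JSON shape. The helper names `build_invoke_request` and `ask` are hypothetical.

```python
import json
from urllib import request

# Assumption: the backend is served locally with the default Uvicorn port.
BASE_URL = "http://127.0.0.1:8000"


def build_invoke_request(question: str, base_url: str = BASE_URL) -> request.Request:
    """Build a POST request for the /invoke route (hypothetical helper)."""
    payload = json.dumps({"input": question}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/invoke",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL):
    """Send the question to a running backend and return the chain output."""
    with request.urlopen(build_invoke_request(question, base_url)) as response:
        return json.loads(response.read())["output"]


# With the server running:
# print(ask("Who is Bill Gates?"))
```

The request construction is kept separate from the network call so that the payload can be inspected without a running server.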
|
||
You can also query your RAG using the Langserve playground at http://0.0.0.0:8000/playground. It should look like this: | ||
|
||
![base_playground.png](base_playground.png) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
We provide two plugins for user management: secure and insecure authentication. | ||
|
||
- The secure authentication is the recommended approach when the API is intended to be deployed on a public endpoint. | ||
- The insecure plugin is a little simpler and can be used for testing, or when it is not an issue for users to have access to other people's sessions. | ||
|
||
!!! danger "Do not use the insecure auth plugin for deployments exposed to the internet" | ||
    This would allow anyone to query your LLM and spend your tokens. | ||
|
||
```python | ||
from backend.api_plugins import insecure_authentication_routes | ||
``` | ||
```python | ||
rag = RAG(config=Path(__file__).parent / "config.yaml") | ||
chain = rag.get_chain() | ||
|
||
app = FastAPI( | ||
    title="RAG Accelerator", | ||
    description="A RAG-based question answering API", | ||
) | ||
|
||
auth = insecure_authentication_routes(app) | ||
add_routes(app, chain, dependencies=[auth]) | ||
``` | ||
|
||
As with the sessions plugin, we add the routes that allow users to sign up and log in using the `insecure_authentication_routes` plugin. | ||
|
||
|
||
The tricky part is that we need all the existing endpoints to be covered by the authentication. To do this, we inject `auth` as a dependency of Langserve's `add_routes`. | ||
|
||
We have new user management routes: | ||
![auth_api.png](auth_api.png) | ||
|
||
Every other route now expects an email as a parameter, which can be used to retrieve previous chats, for example. | ||
![auth_invoke.png](auth_invoke.png) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
### Conversational RAG | ||
|
||
Let's say you want session support, so that the RAG can hold a conversation rather than just answer standalone questions: | ||
```python | ||
from backend.api_plugins import session_routes | ||
``` | ||
```python | ||
rag = RAG(config=Path(__file__).parent / "config.yaml") | ||
chain = rag.get_chain(memory=True) | ||
|
||
app = FastAPI( | ||
    title="RAG Accelerator", | ||
    description="A RAG-based question answering API", | ||
) | ||
|
||
add_routes(app, chain) | ||
session_routes(app) | ||
``` | ||
|
||
We have added two things here: | ||
|
||
- We set `memory=True` in `RAG.get_chain`. That will create a slightly different chain than before. This new chain adds memory handling capabilities to our RAG. | ||
- We imported and called the `session_routes` plugin. | ||
|
||
We will now have new session management routes available in the API: | ||
![sessions_api.png](sessions_api.png) | ||
|
||
The playground also now takes a `SESSION ID` configuration: | ||
![sessions_playground.png](sessions_playground.png) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
Plugins are used to add functionalities to the API as you need them. | ||
|
||
We provide a few plugins out of the box in `backend/api_plugins`, and you will also be able to create your own. If you write a useful plugin, don't hesitate to open a PR! | ||
|
||
A plugin takes the form of a function that wraps all the FastAPI routes it introduces. | ||
|
||
### Data model | ||
|
||
Plugins may need special database tables to function properly. You can bundle a SQL script that adds these tables if they don't exist when the plugin is instantiated. For example, the authentication plugin adds a table that stores users. | ||
|
||
`users_tables.sql`: | ||
```sql | ||
CREATE TABLE IF NOT EXISTS "users" ( | ||
"email" VARCHAR(255) PRIMARY KEY, | ||
"password" TEXT | ||
); | ||
``` | ||
```python | ||
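from pathlib import Path | ||
from typing import List, Optional | ||
from fastapi import Depends | ||
|
||
def authentication_routes(app, dependencies: Optional[List[Depends]] = None): | ||
    from backend.database import Database | ||
    # Create the users table if it does not exist yet | ||
    with Database() as connection: | ||
        connection.run_script(Path(__file__).parent / "users_tables.sql") | ||
|
||
    # rest of the plugin | ||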
``` | ||
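
To illustrate why `CREATE TABLE IF NOT EXISTS` makes the script safe to run every time the plugin is instantiated, here is a small self-contained sketch. It uses an in-memory SQLite database as a stand-in for the project's `Database` class, and the `run_script` helper below is hypothetical, not the project's implementation.

```python
import sqlite3

USERS_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS "users" (
    "email" VARCHAR(255) PRIMARY KEY,
    "password" TEXT
);
"""


def run_script(connection: sqlite3.Connection, script: str) -> None:
    """Hypothetical helper: execute every statement of a SQL script."""
    connection.executescript(script)
    connection.commit()


# Assumption: an in-memory SQLite database stands in for the real one.
connection = sqlite3.connect(":memory:")

# First instantiation of the plugin: the table is created.
run_script(connection, USERS_TABLE_SQL)
connection.execute(
    'INSERT INTO "users" VALUES (?, ?)', ("alice@example.com", "hashed-password")
)

# Second instantiation: the script does nothing and existing rows are kept.
run_script(connection, USERS_TABLE_SQL)

rows = connection.execute('SELECT "email" FROM "users"').fetchall()
```

Running the script a second time neither fails nor drops the stored user, which is the property a plugin relies on when it is loaded at every application start.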
|
||
### Dependencies | ||
|
||
Plugins should allow for dependency injection. In practice, that means the wrapper function should accept a list of FastAPI `Depends` objects and pass it to all the wrapped routes. For example, the sessions plugin takes an unspecified list of dependencies that may be needed in the future, and an explicit auth dependency to link sessions to users. [Learn more about FastAPI dependencies here.](https://fastapi.tiangolo.com/tutorial/dependencies/) |
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
Now we bring it all together: sessions and secure authentication. By combining the sessions plugin and the secure authentication plugin, we can have user-specific profiles that are completely distinct from one another. | ||
|
||
```python | ||
from backend.api_plugins import session_routes, authentication_routes | ||
``` | ||
```python | ||
rag = RAG(config=Path(__file__).parent / "config.yaml") | ||
chain = rag.get_chain(memory=True) | ||
|
||
app = FastAPI( | ||
    title="RAG Accelerator", | ||
    description="A RAG-based question answering API", | ||
) | ||
|
||
auth = authentication_routes(app) | ||
session_routes(app, authentication=auth) | ||
add_routes(app, chain, dependencies=[auth]) | ||
``` | ||
|
||
Here our authentication plugin is injected into both the sessions and core routes. With this setup, all calls need to be authenticated with a bearer token that the API provides after a successful login. | ||
|
||
Notice the lock pictograms on every route. These indicate that the routes are protected by our authentication scheme. You can still query your RAG using this interface by first logging in through the `Authorize` button. The Langserve playground, however, does not support this and is no longer usable. | ||
![sec_auth_api.png](sec_auth_api.png) |
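
Outside of the documentation page, an authenticated call amounts to sending the token in an `Authorization: Bearer ...` header. The sketch below only shows how such a request could be built with the standard library. It assumes you already obtained a token from the login route, that the server runs on `http://127.0.0.1:8000`, and that the payload uses a plain `{"input": ...}` body; with `memory=True` the expected input may differ. The helper name is hypothetical.

```python
import json
from urllib import request


def build_authenticated_invoke(
    question: str, token: str, base_url: str = "http://127.0.0.1:8000"
) -> request.Request:
    """Hypothetical helper: build an /invoke request carrying a bearer token."""
    return request.Request(
        url=f"{base_url}/invoke",
        data=json.dumps({"input": question}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The token is the one returned by the API after a successful login.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# With the server running and a valid token:
# with request.urlopen(build_authenticated_invoke("Who is Bill Gates?", token)) as response:
#     print(json.loads(response.read()))
```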
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
## The `RAG` class | ||
|
||
The RAG class orchestrates the components necessary for a retrieval-augmented generation pipeline. | ||
It initializes with a configuration, either directly or from a file. | ||
|
||
![RAG](RAG.png) | ||
|
||
The RAG object has two main purposes: | ||
|
||
- loading the RAG with documents, which involves ingesting and processing documents to be retrievable by the system | ||
- generating the chain from the components as specified in the configuration, which entails assembling the various components (language model, embeddings, vector store) into a coherent pipeline for generating responses based on retrieved information. | ||
|
||
|
||
!!! example "Loading and querying documents" | ||
```python | ||
from pathlib import Path | ||
from backend.rag_components.rag import RAG | ||
|
||
rag = RAG(config=Path(__file__).parent / "backend" / "config.yaml") | ||
chain = rag.get_chain() | ||
|
||
print(chain.invoke("Who is bill Gates?")) | ||
# > content='Documents have not been provided, and thus I am unable to give a response based on them. Would you like me to answer based on general knowledge instead?' | ||
|
||
rag.load_file(Path(__file__).parent / "data_sample" / "billionaires.csv") | ||
# > loader selected CSVLoader for /.../data_sample/billionaires.csv | ||
# > {'event': 'load_documents', 'num_added': 2640, 'num_updated': 0, 'num_skipped': 0, 'num_deleted': 0} | ||
|
||
print(chain.invoke("Who is bill Gates?")) | ||
# > content='Bill Gates is a 67-year-old businessman from the United States, residing in Medina, Washington. He is the co-chair of the Bill & Melinda Gates Foundation and is recognized for his self-made success, primarily through Microsoft in the technology industry. As of the provided document dated April 4, 2023, Bill Gates has a final worth of $104 billion, ranking him 6th in the category of Technology. His full name is William Gates, and he was born on October 28, 1955.' | ||
``` | ||
|
||
## `RAGConfig` | ||
|
||
Configuration of the RAG is done using the `RAGConfig` dataclass. You can instantiate one directly in Python, but the preferred way is to use the `backend/config.yaml` file. This YAML is then automatically parsed into a `RAGConfig` that can be fed to the `RAG` class. | ||
|
||
The configuration lets you choose which implementation to use for each RAG component: | ||
|
||
- The LLM | ||
- The embedding model | ||
- The vector store / retriever | ||
- The memory / database | ||
|
||
Zooming in on the `LLMConfig` as an example: | ||
```python | ||
@dataclass | ||
class LLMConfig: | ||
source: BaseChatModel | LLM | str | ||
source_config: dict | ||
temperature: float | ||
``` | ||
|
||
- `source` is the name of the LangChain class of your model, either a `BaseChatModel` or an `LLM`. | ||
- `source_config` holds the parameters used to instantiate the `source`. | ||
- `temperature` regulates the unpredictability of a language model's output. | ||
|
||
Example of a configuration that uses a local model served with Ollama. In `backend/config.yaml`: | ||
```yaml | ||
LLMConfig: &LLMConfig | ||
source: ChatOllama | ||
source_config: | ||
model: tinyllama | ||
temperature: 0 | ||
``` | ||
!!! info "Configuration recipes" | ||
    You can find fully tested recipes for LLMConfig, VectorStoreConfig, EmbeddingModelConfig, and DatabaseConfig [in the Cookbook](../cookbook/cookbook.md). | ||
|
||
The YAML configuration above is equivalent to the following Python, which is generated and executed under the hood when a `RAG` object is created: | ||
```python | ||
llm = ChatOllama(model="tinyllama", temperature=0) | ||
``` | ||
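
The mapping from `source` and `source_config` to an instantiated object can be pictured with the following simplified sketch. It is not the project's implementation: `FakeChatModel` stands in for a LangChain class such as `ChatOllama`, and `SimpleLLMConfig`, `MODEL_REGISTRY`, and `build_llm` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field


class FakeChatModel:
    """Stand-in for a LangChain chat model class such as ChatOllama."""

    def __init__(self, model: str, temperature: float = 0.0):
        self.model = model
        self.temperature = temperature


# Hypothetical registry that maps the `source` string to a class.
MODEL_REGISTRY = {"FakeChatModel": FakeChatModel}


@dataclass
class SimpleLLMConfig:
    """Simplified stand-in for LLMConfig, keeping only the two fields used here."""

    source: str
    source_config: dict = field(default_factory=dict)


def build_llm(config: SimpleLLMConfig):
    """Look up the class named by `source` and instantiate it with `source_config`."""
    model_class = MODEL_REGISTRY[config.source]
    return model_class(**config.source_config)


llm = build_llm(
    SimpleLLMConfig(
        source="FakeChatModel",
        source_config={"model": "tinyllama", "temperature": 0},
    )
)
```

Keeping the class name and its parameters as plain data is what allows the whole pipeline to be described in a YAML file and swapped without touching the code.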
|
||
|
||
You can also write the configuration directly in Python, although that is not the recommended approach here. | ||
```python | ||
from langchain_community.chat_models import ChatOllama | ||
from backend.config import LLMConfig | ||
llm_config = LLMConfig( | ||
source=ChatOllama, | ||
source_config={"model": "llama2", "temperature": 0}, | ||
) | ||
``` | ||
|
||
### Extending the `RAGConfig` | ||
|
||
See: [How to extend the RAGConfig](../cookbook/extend_ragconfig.md) |