fix: doc
AlexisVLRT committed Feb 21, 2024
1 parent 1e5db4d commit 3857988
Showing 19 changed files with 246 additions and 0 deletions.
Binary file added docs/backend/RAG.png
42 changes: 42 additions & 0 deletions docs/backend/backend.md
The backend provides a REST API that abstracts the RAG functionalities. The core ships with just enough to query your indexed documents.

More advanced features (authentication, user sessions, ...) can be enabled through [plugins](plugins/plugins.md).

### Architecture

![](backend.png)

Start the backend server locally:
```shell
python -m uvicorn backend.main:app
```
> INFO: Application startup complete.
> INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)

### Base RAG

The base RAG-as-a-service API is defined at `backend/main.py`:
```python
from pathlib import Path

from fastapi import FastAPI
from langserve import add_routes

from backend.rag_components.rag import RAG

rag = RAG(config=Path(__file__).parent / "config.yaml")
chain = rag.get_chain()

app = FastAPI(
    title="RAG Accelerator",
    description="A RAG-based question answering API",
)

add_routes(app, chain)
```
The basic core RAG lets you load documents and ask questions about them. `add_routes` comes straight from LangServe and sets up the basic API routes for serving the chain. Our plugins will be added similarly.

By going to the API documentation (http://0.0.0.0:8000/docs if serving locally) you will see these routes. You can query your RAG directly from there using the `/invoke` endpoint if you want to.
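You can also call the `/invoke` endpoint from a script. Below is a minimal sketch using only the Python standard library; the payload follows LangServe's usual convention of wrapping the chain input under an `"input"` key, but treat that as an assumption and check your deployment's `/docs` page for the exact schema.

```python
import json
import urllib.request
from typing import Optional


def invoke_rag(question: str, base_url: str = "http://localhost:8000") -> Optional[dict]:
    """Send a question to the /invoke endpoint of a locally running backend.

    Returns the parsed JSON response, or None if the server is unreachable.
    """
    # LangServe conventionally wraps the chain input under an "input" key;
    # check the generated /docs page for your chain's exact input schema.
    request = urllib.request.Request(
        f"{base_url}/invoke",
        data=json.dumps({"input": question}).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return json.loads(response.read())
    except OSError:  # covers URLError, refused connections, and timeouts
        return None


print(invoke_rag("What do my documents contain?"))  # parsed response, or None if the backend is not running
```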

![base_api.png](base_api.png)

![base_invoke.png](base_invoke.png)

You can also query your RAG using the LangServe playground at http://0.0.0.0:8000/playground. It should look like this:

![base_playground.png](base_playground.png)
Binary file added docs/backend/backend.png
Binary file added docs/backend/base_api.png
Binary file added docs/backend/base_invoke.png
Binary file added docs/backend/base_playground.png
Binary file added docs/backend/playground.png
Binary file added docs/backend/plugins/auth_api.png
Binary file added docs/backend/plugins/auth_invoke.png
34 changes: 34 additions & 0 deletions docs/backend/plugins/authentication.md
We provide two plugins for user management: secure and insecure authentication.

- The secure authentication plugin is the recommended approach when the API is intended to be deployed on a public endpoint.
- The insecure plugin is a little simpler and can be used for testing, or when users having access to other people's sessions is not an issue.

!!! danger "Do not use the insecure auth plugin for deployments exposed to the internet"
    This would allow anyone to query your LLM and spend your tokens.

```python
from backend.api_plugins import insecure_authentication_routes
```
```python
rag = RAG(config=Path(__file__).parent / "config.yaml")
chain = rag.get_chain()

app = FastAPI(
    title="RAG Accelerator",
    description="A RAG-based question answering API",
)

auth = insecure_authentication_routes(app)
add_routes(app, chain, dependencies=[auth])
```

As with the sessions plugin before, we add the routes that allow users to sign up and log in using the `insecure_authentication_routes` plugin.


The tricky part is that all the existing endpoints need to be covered by the authentication. To do this, we inject `auth` as a dependency of LangServe's `add_routes`.

We have new user management routes:
![auth_api.png](auth_api.png)

And now every other route expects an email as a parameter, which can be used, for example, to retrieve a user's previous chats.
![auth_invoke.png](auth_invoke.png)
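As a sketch of what this looks like from a client's perspective, a request body might carry the email alongside the chain input. The field name and placement below are assumptions for illustration only; the `/docs` page of your deployment shows the real schema.

```python
# Hypothetical request body for /invoke with the insecure auth plugin active.
# Where exactly the email travels (body, query, header) depends on how the
# plugin wires its dependency, so verify against the generated /docs page.
payload = {
    "input": "What do my documents say about Bill Gates?",
    "config": {"configurable": {"email": "alice@example.com"}},
}
```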
29 changes: 29 additions & 0 deletions docs/backend/plugins/conversational_rag_plugin.md
### Conversational RAG

Let's say you want session support so the RAG can hold a conversation rather than just answer standalone questions:
```python
from backend.api_plugins import session_routes
```
```python
rag = RAG(config=Path(__file__).parent / "config.yaml")
chain = rag.get_chain(memory=True)

app = FastAPI(
    title="RAG Accelerator",
    description="A RAG-based question answering API",
)

add_routes(app, chain)
session_routes(app)
```

We have added two things here:

- We set `memory=True` in `RAG.get_chain`. That will create a slightly different chain than before. This new chain adds memory handling capabilities to our RAG.
- We imported and called the `session_routes` plugin.

We will now have new session management routes available in the API:
![sessions_api.png](sessions_api.png)

And also, the playground now takes a `SESSION ID` configuration:
![sessions_playground.png](sessions_playground.png)
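To illustrate, two calls sharing a session identifier let the second question refer back to the first. The `session_id` key below is an assumption for illustration; check the `/docs` page for the actual schema the plugin exposes.

```python
# Hypothetical sketch: with sessions enabled, each /invoke call can carry a
# session identifier so the chain loads the matching conversation history.
first_turn = {
    "input": "Who founded Microsoft?",
    "config": {"configurable": {"session_id": "demo-session"}},
}
follow_up = {
    "input": "And when was he born?",  # resolved against the session history
    "config": {"configurable": {"session_id": "demo-session"}},
}
```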
29 changes: 29 additions & 0 deletions docs/backend/plugins/plugins.md
Plugins are used to add functionalities to the API as you need them.

We provide a few plugins out of the box in `backend/api_plugins`, and you will also be able to create your own. If you write a useful plugin, don't hesitate to open a PR!

A plugin takes the form of a function that wraps all the FastAPI routes it introduces.

### Data model

Plugins may need special database tables to function properly. You can bundle a SQL script that adds the table if it doesn't exist when the plugin is instantiated. For example, the authentication plugins add a table that stores users.

`users_tables.sql`:
```sql
CREATE TABLE IF NOT EXISTS "users" (
    "email" VARCHAR(255) PRIMARY KEY,
    "password" TEXT
);
```
```python
from pathlib import Path
from typing import List, Optional

from fastapi import Depends, FastAPI


def authentication_routes(app: FastAPI, dependencies: Optional[List[Depends]] = None):
    from backend.database import Database

    with Database() as connection:
        connection.run_script(Path(__file__).parent / "users_tables.sql")

    # rest of the plugin
```

### Dependencies

Plugins should allow for dependency injection. In practice, that means the wrapper function should accept a list of FastAPI `Depends` objects and pass it to all the wrapped routes. For example, the sessions plugin takes an unspecified list of dependencies that may be needed in the future, and an explicit auth dependency to link sessions to users. [Learn more about FastAPI dependencies here.](https://fastapi.tiangolo.com/tutorial/dependencies/)
Binary file added docs/backend/plugins/sec_auth_api.png
Empty file.
Binary file added docs/backend/plugins/sessions_api.png
Binary file added docs/backend/plugins/sessions_playground.png
23 changes: 23 additions & 0 deletions docs/backend/plugins/user_based_sessions.md
Now we bring it all together: sessions, and secure authentication. By combining the sessions plugin and the secure authentication plugin, we can have user-specific profiles which are completely distinct from one another.

```python
from backend.api_plugins import session_routes, authentication_routes
```
```python
rag = RAG(config=Path(__file__).parent / "config.yaml")
chain = rag.get_chain(memory=True)

app = FastAPI(
    title="RAG Accelerator",
    description="A RAG-based question answering API",
)

auth = authentication_routes(app)
session_routes(app, authentication=auth)
add_routes(app, chain, dependencies=[auth])
```

Here our authentication plugin is injected into both the sessions and the core routes. With this setup, all calls must be authenticated with a bearer token that the API provides after a successful login.

Notice the lock pictograms on every route. They indicate the routes are protected by our authentication scheme. You can still query your RAG from this interface by first logging in through the `Authorize` button. The LangServe playground does not support this, however, and is no longer usable.
![sec_auth_api.png](sec_auth_api.png)
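Scripted clients must therefore attach the token to every call. Below is a minimal standard-library sketch; the token is assumed to come from the login route the plugin exposes (see `/docs` for its exact path and response shape), and the `"input"` payload key is LangServe's usual convention.

```python
import json
import urllib.request
from typing import Optional


def invoke_authenticated(
    question: str, token: str, base_url: str = "http://localhost:8000"
) -> Optional[dict]:
    """Call /invoke with a bearer token; returns None if the server is unreachable."""
    request = urllib.request.Request(
        f"{base_url}/invoke",
        data=json.dumps({"input": question}).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # token obtained from the login route
        },
    )
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return json.loads(response.read())
    except OSError:  # unreachable server, refused connection, or timeout
        return None
```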
89 changes: 89 additions & 0 deletions docs/backend/rag_ragconfig.md
## The `RAG` class

The RAG class orchestrates the components necessary for a retrieval-augmented generation pipeline.
It initializes with a configuration, either directly or from a file.

![RAG](RAG.png)

The RAG object has two main purposes:

- loading the RAG with documents, which involves ingesting and processing documents to be retrievable by the system
- generating the chain from the components as specified in the configuration, which entails assembling the various components (language model, embeddings, vector store) into a coherent pipeline for generating responses based on retrieved information.


!!! example "Loading and querying documents"
    ```python
    from pathlib import Path
    from backend.rag_components.rag import RAG

    rag = RAG(config=Path(__file__).parent / "backend" / "config.yaml")
    chain = rag.get_chain()

    print(chain.invoke("Who is bill Gates?"))
    # > content='Documents have not been provided, and thus I am unable to give a response based on them. Would you like me to answer based on general knowledge instead?'

    rag.load_file(Path(__file__).parent / "data_sample" / "billionaires.csv")
    # > loader selected CSVLoader for /.../data_sample/billionaires.csv
    # > {'event': 'load_documents', 'num_added': 2640, 'num_updated': 0, 'num_skipped': 0, 'num_deleted': 0}

    print(chain.invoke("Who is bill Gates?"))
    # > content='Bill Gates is a 67-year-old businessman from the United States, residing in Medina, Washington. He is the co-chair of the Bill & Melinda Gates Foundation and is recognized for his self-made success, primarily through Microsoft in the technology industry. As of the provided document dated April 4, 2023, Bill Gates has a final worth of $104 billion, ranking him 6th in the category of Technology. His full name is William Gates, and he was born on October 28, 1955.'
    ```

## `RAGConfig`

Configuration of the RAG is done using the `RAGConfig` dataclass. You can instantiate one directly in Python, but the preferred way is to use the `backend/config.yaml` file. This YAML is automatically parsed into a `RAGConfig` that can be fed to the `RAG` class.

The configuration lets you specify which implementation to use for each RAG component:

- The LLM
- The embedding model
- The vector store / retriever
- The memory / database

Zooming in on the `LLMConfig` as an example:
```python
@dataclass
class LLMConfig:
    source: BaseChatModel | LLM | str
    source_config: dict
    temperature: float
```

- `source` is the name of the LangChain class of your model, either a `BaseChatModel` or an `LLM`.
- `source_config` holds the parameters used to instantiate the `source`.
- `temperature` regulates the unpredictability of the language model's output.

Example of a configuration that uses a local model served with Ollama. In `backend/config.yaml`:
```yaml
LLMConfig: &LLMConfig
  source: ChatOllama
  source_config:
    model: tinyllama
  temperature: 0
```
!!! info "Configuration recipes"
    You can find fully tested recipes for LLMConfig, VectorStoreConfig, EmbeddingModelConfig, and DatabaseConfig [in the Cookbook](../cookbook/cookbook.md).

This is the Python equivalent that is generated and executed under the hood when a `RAG` object is created:
```python
llm = ChatOllama(model="tinyllama", temperature=0)
```


You can also write the configurations directly in Python, although that's not the recommended approach here. Note that `temperature` is its own field on `LLMConfig`, separate from `source_config`:
```python
from langchain_community.chat_models import ChatOllama

from backend.config import LLMConfig

llm_config = LLMConfig(
    source=ChatOllama,
    source_config={"model": "llama2"},
    temperature=0,
)
```

### Extending the `RAGConfig`

See: [How to extend the RAGConfig](../cookbook/extend_ragconfig.md)
Binary file added docs/backend/swagger_base.png
