FastRAG

FastRAG is a simple Retrieval-Augmented Generation (RAG) application optimized to run fast on consumer-grade PCs. It provides a chatbot interface that uses vector-based search and large language models (LLMs) to answer questions about document-based data.

🚀 Launch API and Demo Locally

To get started with FastRAG locally, follow these steps:

Clone the repository:

```
git clone https://github.com/bibekyess/FastRAG.git
```

Navigate to the project directory:
```
cd FastRAG
```
Build and launch the containers:
```
docker compose up --build
```

This will start the FastRAG API and demo with all necessary services.

🛠️ API Endpoints

The FastRAG application exposes the following API endpoints:

Get Conversation History
- Method: GET
- Endpoint: /conversation-history
- Parameters:
  - collection_name (str): Name of the collection to fetch history from.
  - limit (int): Number of history entries to return. Default is 10.

Add to Conversation History
- Method: POST
- Endpoint: /conversation-history
- Body:
  - collection_name (str): Name of the collection to add the history entry to.
  - query (str): User input query
  - response_text (str): AI response
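
As a rough illustration, the two conversation-history calls above could be built with Python's standard library as follows. The base URL `http://localhost:8000`, the helper names, the example collection name, and the assumption that the POST body is JSON are illustrative only and are not confirmed by this README; check `docker-compose.yaml` for the actual host and port.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the API listens here; the real host/port is defined in docker-compose.yaml.
BASE_URL = "http://localhost:8000"


def build_get_history(collection_name: str, limit: int = 10) -> Request:
    """Build (but do not send) a GET /conversation-history request."""
    query = urlencode({"collection_name": collection_name, "limit": limit})
    return Request(f"{BASE_URL}/conversation-history?{query}", method="GET")


def build_add_history(collection_name: str, query: str, response_text: str) -> Request:
    """Build (but do not send) a POST /conversation-history request.

    Assumption: the server expects the body as JSON.
    """
    body = json.dumps(
        {
            "collection_name": collection_name,
            "query": query,
            "response_text": response_text,
        }
    ).encode("utf-8")
    return Request(
        f"{BASE_URL}/conversation-history",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Once the containers are running, either request can be sent by passing it to `urllib.request.urlopen(...)`.
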

Parse Document
- Method: POST
- Endpoint: /parse
- Parameters:
  - file (UploadFile): The document to be parsed.
  - index_id (str): Index name for the document. Default is "files".
  - splitting_type (Literal['raw', 'md']): Splitting type for the document. Default is "raw" (based on chunk settings).
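
A minimal sketch of a `/parse` upload using only the standard library is shown below. It assumes that `file` is sent as a multipart form field and that `index_id` and `splitting_type` are passed as query parameters; the README does not say how they are transmitted. The base URL and the helper name `build_parse_request` are hypothetical.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the API listens here; the real host/port is defined in docker-compose.yaml.
BASE_URL = "http://localhost:8000"


def build_parse_request(
    filename: str,
    content: bytes,
    index_id: str = "files",
    splitting_type: str = "raw",
) -> Request:
    """Build (but do not send) a multipart POST /parse request."""
    if splitting_type not in ("raw", "md"):
        raise ValueError("splitting_type must be 'raw' or 'md'")

    # Hand-rolled multipart/form-data body with a single "file" part.
    boundary = uuid.uuid4().hex
    body = b"\r\n".join(
        [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
            b"Content-Type: application/octet-stream",
            b"",
            content,
            f"--{boundary}--".encode(),
            b"",
        ]
    )

    # Assumption: the non-file parameters travel in the query string.
    query = urlencode({"index_id": index_id, "splitting_type": splitting_type})
    return Request(
        f"{BASE_URL}/parse?{query}",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

If the server instead expects `index_id` and `splitting_type` as form fields, they would need to be added as extra parts of the multipart body.
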

Chat with the Bot
- Method: POST
- Endpoint: /chat
- Body:
  - user_input (str): The user's query.
  - index_id (str): The index to search. Default is "files".
  - llm_text (str): The LLM model to use. Default is "local".
  - dense_top_k (int): The number of top results to return from the vector search. Default is 5.
  - upgrade_user_input (bool): Whether to refine the user input using the conversation history. Default is False.
  - stream (bool): Flag to enable streaming of results. Default is True.
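
The `/chat` body above could be assembled as in the following sketch, which fills in the documented defaults. The base URL, the helper name `build_chat_request`, and the use of a JSON body are assumptions rather than details taken from the project.

```python
import json
from urllib.request import Request

# Assumption: the API listens here; the real host/port is defined in docker-compose.yaml.
BASE_URL = "http://localhost:8000"


def build_chat_request(
    user_input: str,
    index_id: str = "files",
    llm_text: str = "local",
    dense_top_k: int = 5,
    upgrade_user_input: bool = False,
    stream: bool = True,
) -> Request:
    """Build (but do not send) a POST /chat request using the documented defaults."""
    payload = {
        "user_input": user_input,
        "index_id": index_id,
        "llm_text": llm_text,
        "dense_top_k": dense_top_k,
        "upgrade_user_input": upgrade_user_input,
        "stream": stream,
    }
    return Request(
        f"{BASE_URL}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With `stream` left at its default of True, the reply presumably arrives incrementally. This README does not describe the wire format of the stream, so a client should not assume a particular framing without checking the API code.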

🖥️ User Interface

- Gradio UI: FastRAG features a simple Gradio-based user interface for interacting with the chatbot.
- Real-time Chat: Users can upload a document and ask questions in real time, with previous conversations stored and used as context for later answers. (The document upload option is still in progress.)

🗃️ Storage and Database

QdrantDB: The vector embeddings and chatbot conversation history are stored in QdrantDB. This allows the chatbot to utilize previous conversation context for improved responses.

⚡ Model Backend

Model: bartowski/Llama-3.2-3B-Instruct-GGUF

⏱️ Latency Tracking

- UI Display: Latency of the chatbot's response is displayed in the Gradio interface.
- Logging: Detailed logs of latency and other events are saved for debugging and performance monitoring.
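
One common way to measure and log latency like this is a small timing context manager. The sketch below is not FastRAG's implementation; the logger name `fastrag.latency` and the helper `track_latency` are hypothetical.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("fastrag.latency")


@contextmanager
def track_latency(label: str, results: dict):
    """Time the enclosed block, store the elapsed seconds, and log them."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        results[label] = elapsed
        logger.info("%s took %.3f s", label, elapsed)
```

The stored value in `results` could then be shown in a UI, while the log line serves debugging and monitoring.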

🧾 Document Parsing Options

FastRAG offers multiple options for segmenting documents into chunks:

- Raw Format: This option allows experimenting with various chunk sizes, strides, and overlap settings for raw text parsing.
- Markdown Format: This method segments the document based on semantic information, creating more context-aware chunks.
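
To make the two options more concrete, here is a simplified sketch of each idea. It is not FastRAG's parser: the function names, the character-based window, and the choice to split Markdown at heading lines are all assumptions made for illustration.

```python
import re


def chunk_raw(text: str, chunk_size: int = 200, stride: int = 150) -> list[str]:
    """Sliding-window chunking: consecutive chunks overlap by chunk_size - stride characters."""
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def chunk_markdown(text: str) -> list[str]:
    """Split before every ATX heading line so each chunk holds one section."""
    parts = re.split(r"(?m)^(?=#{1,6} )", text)
    return [part.strip() for part in parts if part.strip()]
```

For example, `chunk_raw("abcdefghij", chunk_size=4, stride=2)` yields windows that each share two characters with the previous one, while `chunk_markdown` keeps a heading together with the text beneath it.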
