Skip to content

A powerful document chatbot that combines vLLM, FastAPI, and Streamlit to provide intelligent responses based on uploaded documents. The system supports both OpenAI GPT and Llama 3 models, featuring real-time streaming responses and conversation memory.

License

Notifications You must be signed in to change notification settings

MohamedSebaie/document-chatbot

Repository files navigation

Document ChatBot with vLLM

A powerful document chatbot that combines vLLM, FastAPI, and Streamlit to provide intelligent responses based on uploaded documents. The system supports both OpenAI GPT and Llama 3 models, featuring real-time streaming responses and conversation memory.

Features

  • 📑 Document Processing: Upload and process PDF documents
  • 💬 Intelligent Chat: Context-aware responses using vLLM
  • 🔍 Document Search: Semantic search using FAISS
  • 🚀 High Performance: Tensor parallelism support with vLLM
  • 🌐 Modern Interface: Streamlit-based UI with real-time responses
  • 📚 Multi-document Support: Chat with multiple uploaded documents
  • 🔄 Context Retention: Maintains conversation context (Working on it)
  • 📈 Source Citations: Provides references for responses

Demo

Demo Preview

Features Demonstrated

  • 📝 Document Upload & Processing
  • 💬 Interactive Chat Interface
  • 🔍 Context-Aware Responses
  • 📚 Source Citations
  • ⚡ Real-time Processing

System Requirements

  • NVIDIA Driver
  • CUDA Toolkit >= 12.1
  • Python 3.10+
  • 16GB+ RAM
  • Ubuntu 22.04 or later

Technical Stack

  • Backend: FastAPI
  • Frontend: Streamlit
  • Models: OpenAI GPT, Llama 3
  • GPU Parallelism: vLLM
  • Vector Store: FAISS
  • Document Processing: PyPDF2, LangChain
  • Containerization: Docker

Project Structure

document-chatbot/
├── app/
│   ├── api/
│   │   └── routes.py         # FastAPI routes
│   └── core/
│       ├── config.py         # Configuration settings
│       ├── document_processor.py  # PDF processing
│       ├── exceptions.py     # Custom exceptions
│       ├── llm_manager.py    # LLM integration
│       └── memory_manager.py # Conversation memory
├── frontend/
│   ├── app.py               # Streamlit application
│   └── components/
│       ├── chat_interface.py    # Streamlit chat interface
│       ├── document_uploader.py # File upload handling
│       └── model_selector.py    # Model selector
├── scripts/
│   └── start_services.py    # Service orchestration
├── data/
│   ├── uploads/             # Document storage
│   └── vector_store/        # FAISS indexes
├── Dockerfile
├── docker-compose.yml
├── vllm_env.yaml            # Conda environment
└── README.md

Installation Methods

Method 1: Direct Installation (Recommended for Development)

1. Install NVIDIA Driver and CUDA Toolkit:

# Add NVIDIA package repositories
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda-repo-ubuntu2204-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/

# Install CUDA
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-1

2. Install Miniconda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc

3. Clone the repository:
git clone https://github.com/yourusername/document-chatbot.git
cd document-chatbot

4. Create and activate conda environment:
conda env create -f vllm_env.yaml
conda activate vllm_env

5. Create necessary directories:
mkdir -p data/uploads data/vector_store

6. Start the services:
python scripts/start_services.py

Method 2: Docker Installation (Recommended for Production)

1. Install Docker:

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Add your user to docker group
sudo usermod -aG docker $USER
newgrp docker

2. Install NVIDIA Container Toolkit:
# Add NVIDIA package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Install NVIDIA Docker support
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker

3. Clone and build:
git clone https://github.com/yourusername/document-chatbot.git
cd document-chatbot

# Build and start services
docker-compose up --build

4. Stop services:
docker-compose down

Usage

Access the interfaces:

Upload Documents:

  • Use the sidebar uploader
  • Support for PDF files
  • Wait for processing confirmation

Chat Interface:

  • Type questions in chat
  • View source citations
  • Clear chat history as needed

License

MIT License - see LICENSE file for details

Acknowledgments

  • vLLM Team for the inference engine
  • Hugging Face for model hosting
  • FastAPI and Streamlit teams

Contact

GitHub Issues: Project Issues Email: [email protected]

About

A powerful document chatbot that combines vLLM, FastAPI, and Streamlit to provide intelligent responses based on uploaded documents. The system supports both OpenAI GPT and Llama 3 models, featuring real-time streaming responses and conversation memory.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published