refacto(backend): poetry package manager and chat route refactoring (#2684)

# Description
- Added Poetry as the package manager
- Added pre-commit checks
- Rewrote dependency injection of services and repositories (see the sketch after this list)
- Integrated an async SQLAlchemy engine
- Migrated the Chat repository to SQLModel
- Migrated the ChatHistory repository to SQLModel
- Migrated the User entity to SQLModel
- Added a unit-test methodology with DB rollback (see the test sketch below)
- Added unit tests for ChatRepository
- Added a test for ChatService `get_history`
- Migrated the Brain entity to SQLModel
- Migrated the Prompt entity to SQLModel
- Rewrote the `chat/{chat_id}/question` route
- Updated the Dockerfiles and docker compose files for dev and production

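A minimal sketch of how the refactored pieces above could fit together: an async SQLAlchemy engine, a SQLModel entity, and a repository injected into a FastAPI route via `Depends`. The entity fields, class names, and route path are illustrative assumptions rather than the exact code from this PR; only `PG_DATABASE_ASYNC_URL` comes from the updated `.env.example` further down.

```python
# Hypothetical sketch of the DI + async SQLModel pattern; not the actual quivr code.
import os
from collections.abc import AsyncGenerator
from uuid import UUID

from fastapi import Depends, FastAPI
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine
from sqlmodel import Field, SQLModel, select
from sqlmodel.ext.asyncio.session import AsyncSession

# Async engine built from the asyncpg URL added to .env.example in this PR.
engine = create_async_engine(os.environ["PG_DATABASE_ASYNC_URL"], pool_pre_ping=True)
make_session = async_sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)


class Chat(SQLModel, table=True):  # assumed entity shape, for illustration only
    chat_id: UUID = Field(primary_key=True)
    user_id: UUID
    chat_name: str


async def get_session() -> AsyncGenerator[AsyncSession, None]:
    # One session per request, closed automatically when the request ends.
    async with make_session() as session:
        yield session


class ChatRepository:
    """Repository that receives its session instead of creating one itself."""

    def __init__(self, session: AsyncSession):
        self.session = session

    async def get_user_chats(self, user_id: UUID) -> list[Chat]:
        result = await self.session.exec(select(Chat).where(Chat.user_id == user_id))
        return list(result.all())


def get_chat_repository(
    session: AsyncSession = Depends(get_session),
) -> ChatRepository:
    return ChatRepository(session)


app = FastAPI()


@app.get("/chats/{user_id}")
async def list_chats(
    user_id: UUID, repo: ChatRepository = Depends(get_chat_repository)
):
    return await repo.get_user_chats(user_id)
```
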
Added `quivr_core` subpackages:
- Refactored KnowledgebrainQa
- Added a RAG service to interface with non-RAG dependencies

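And a hedged sketch of the rollback-based unit-test methodology listed above: each test gets a session bound to a dedicated connection whose outer transaction is rolled back at teardown, so nothing the test writes survives. It reuses the `engine`, `Chat`, and `ChatRepository` names from the previous sketch; the fixture and test names are likewise assumptions.

```python
# Hypothetical sketch of a rollback-per-test fixture; it reuses the engine, Chat
# and ChatRepository names from the sketch above and requires pytest-asyncio.
# Assumes the chat table already exists (e.g. created by migrations).
import uuid
from collections.abc import AsyncGenerator

import pytest
import pytest_asyncio
from sqlmodel.ext.asyncio.session import AsyncSession


@pytest_asyncio.fixture()
async def session() -> AsyncGenerator[AsyncSession, None]:
    # One connection and one outer transaction per test; rolling the transaction
    # back at teardown discards everything the test wrote.
    async with engine.connect() as conn:
        transaction = await conn.begin()
        test_session = AsyncSession(bind=conn)
        try:
            yield test_session
        finally:
            await test_session.close()
            await transaction.rollback()


@pytest.mark.asyncio
async def test_get_user_chats(session: AsyncSession):
    user_id = uuid.uuid4()
    session.add(Chat(chat_id=uuid.uuid4(), user_id=user_id, chat_name="test chat"))
    await session.flush()  # write inside the open transaction, no commit

    chats = await ChatRepository(session).get_user_chats(user_id)

    assert len(chats) == 1
    assert chats[0].chat_name == "test chat"
```
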
---------

Co-authored-by: aminediro <[email protected]>
AmineDiro authored Jun 26, 2024
1 parent 1751504 commit ca93cb9
Showing 420 changed files with 13,339 additions and 2,496 deletions.
3 changes: 2 additions & 1 deletion .env.example
@@ -2,7 +2,7 @@
# This file is used to configure the Quivr stack. It is used by the `docker-compose.yml` file to configure the stack.

# OPENAI. Update this to use your API key. To skip OpenAI integration use a fake key, for example: tk-aabbccddAABBCCDDEeFfGgHhIiJKLmnopjklMNOPqQqQqQqQ
OPENAI_API_KEY=CHANGE_ME
OPENAI_API_KEY=CHANGE_ME

# LOCAL
# OLLAMA_API_BASE_URL=http://host.docker.internal:11434 # Uncomment to activate ollama. This is the local url for the ollama api
@@ -28,6 +28,7 @@ NEXT_PUBLIC_AUTH_MODES=password
SUPABASE_URL=http://host.docker.internal:54321
SUPABASE_SERVICE_KEY=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZS1kZW1vIiwicm9sZSI6InNlcnZpY2Vfcm9sZSIsImV4cCI6MTk4MzgxMjk5Nn0.EGIM96RAZx35lJzdJsyH-qQwv8Hdp7fsn3W0YpN81IU
PG_DATABASE_URL=postgresql://postgres:[email protected]:54322/postgres
PG_DATABASE_ASYNC_URL=postgresql+asyncpg://postgres:[email protected]:54322/postgres
ANTHROPIC_API_KEY=null
JWT_SECRET_KEY=super-secret-jwt-token-with-at-least-32-characters-long
AUTHENTICATE=true
1,872 changes: 959 additions & 913 deletions CHANGELOG.md

Large diffs are not rendered by default.

24 changes: 24 additions & 0 deletions backend/.pre-commit-config.yaml
@@ -0,0 +1,24 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: check-added-large-files
      - id: check-toml
      - id: check-yaml
      - id: end-of-file-fixer
      - id: trailing-whitespace
  # Check poetry state
  - repo: https://github.com/python-poetry/poetry
    rev: "1.5.1"
    hooks:
      - id: poetry-check
        args: ["-C", "./backend"]
  - repo: https://github.com/astral-sh/ruff-pre-commit
    # Ruff version.
    rev: v0.4.8
    hooks:
      # Run the linter.
      - id: ruff
        args: [--fix]
      # Run the formatter.
      - id: ruff-format
30 changes: 20 additions & 10 deletions backend/Dockerfile
@@ -31,8 +31,6 @@ RUN apt-get clean && apt-get update && apt-get install -y \
pandoc && \
rm -rf /var/lib/apt/lists/*

# Add Rust binaries to the PATH
ENV PATH="/root/.cargo/bin:${PATH}"

RUN ARCHITECTURE=$(uname -m) && \
if [ "$ARCHITECTURE" = "x86_64" ]; then \
@@ -46,19 +44,31 @@ RUN ARCHITECTURE=$(uname -m) && \
fi && \
rm -rf /var/lib/apt/lists/*

RUN curl -sSL https://install.python-poetry.org | POETRY_HOME=/opt/poetry python && \
cd /usr/local/bin && \
ln -s /opt/poetry/bin/poetry && \
poetry config virtualenvs.create false


# Add Rust binaries to the PATH
ENV PATH="/root/.cargo/bin:${PATH}" \
POETRY_CACHE_DIR=/tmp/poetry_cache \
PYTHONDONTWRITEBYTECODE=1

WORKDIR /code

# Copy just the requirements first
COPY ./requirements.txt .
# Copy pyproject and poetry
COPY ./pyproject.toml ./poetry.lock* /code/

# Run install
RUN poetry install --no-root && \
playwright install --with-deps && \
rm -rf $POETRY_CACHE_DIR

# Upgrade pip and install dependencies
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r requirements.txt && \
playwright install --with-deps

ENV PYTHONPATH=/code

# Copy the rest of the application
COPY . .

EXPOSE 5050

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "5050", "--workers", "6"]
34 changes: 18 additions & 16 deletions backend/Dockerfile.dev
@@ -10,11 +10,8 @@ RUN apt-get clean && apt-get update && apt-get install -y \
libcurl4-openssl-dev \
libssl-dev \
binutils \
pandoc \
curl \
git \
poppler-utils \
tesseract-ocr \
autoconf \
automake \
build-essential \
@@ -31,24 +28,29 @@ RUN apt-get clean && apt-get update && apt-get install -y \
pandoc && \
rm -rf /var/lib/apt/lists/* && apt-get clean

# TODO(@aminediro): multistage build. Probably don't need poetry once it's built
# Install Poetry
RUN curl -sSL https://install.python-poetry.org | POETRY_HOME=/opt/poetry python && \
cd /usr/local/bin && \
ln -s /opt/poetry/bin/poetry && \
poetry config virtualenvs.create false

# Add Rust binaries to the PATH
ENV PATH="/root/.cargo/bin:${PATH}"
ENV PATH="/root/.cargo/bin:${PATH}" \
POETRY_CACHE_DIR=/tmp/poetry_cache \
PYTHONDONTWRITEBYTECODE=1

# Copy just the requirements first
COPY ./requirements.txt .
WORKDIR /code

# Upgrade pip
RUN pip install --upgrade pip
# Copy pyproject and poetry
COPY ./pyproject.toml ./poetry.lock* /code/

# Increase timeout to wait for the new installation
RUN pip install --no-cache-dir -r requirements.txt --timeout 200 && \
playwright install --with-deps
# Run install
RUN poetry install --no-root && \
playwright install --with-deps && \
rm -rf $POETRY_CACHE_DIR

WORKDIR /code
# Copy the rest of the application
COPY . .

ENV PYTHONPATH=/code

EXPOSE 5050

CMD ["uvicorn", "main:app","--reload", "--host", "0.0.0.0", "--port", "5050", "--workers", "6"]
1 change: 1 addition & 0 deletions backend/README.md
@@ -0,0 +1 @@
# Quivr backend
File renamed without changes.
@@ -40,5 +40,4 @@
else:
    raise ValueError(f"Unsupported broker URL: {CELERY_BROKER_URL}")


celery.autodiscover_tasks(["modules.sync", "modules", "middlewares", "packages"])
celery.autodiscover_tasks(["backend.modules.sync.tasks"])
42 changes: 22 additions & 20 deletions backend/celery_worker.py → backend/backend/celery_worker.py
@@ -4,24 +4,27 @@
from uuid import UUID

from celery.schedules import crontab
from celery_config import celery
from logger import get_logger
from middlewares.auth.auth_bearer import AuthBearer
from models.files import File
from models.settings import get_supabase_client, get_supabase_db
from modules.brain.integrations.Notion.Notion_connector import NotionConnector
from modules.brain.service.brain_service import BrainService
from modules.brain.service.brain_vector_service import BrainVectorService
from modules.notification.dto.inputs import NotificationUpdatableProperties
from modules.notification.entity.notification import NotificationsStatusEnum
from modules.notification.service.notification_service import NotificationService
from modules.onboarding.service.onboarding_service import OnboardingService
from packages.files.crawl.crawler import CrawlWebsite, slugify
from packages.files.parsers.github import process_github
from packages.files.processors import filter_file
from packages.utils.telemetry import maybe_send_telemetry
from pytz import timezone

from backend.celery_config import celery
from backend.logger import get_logger
from backend.middlewares.auth.auth_bearer import AuthBearer
from backend.models.files import File
from backend.models.settings import get_supabase_client, get_supabase_db
from backend.modules.brain.integrations.Notion.Notion_connector import NotionConnector
from backend.modules.brain.service.brain_service import BrainService
from backend.modules.brain.service.brain_vector_service import BrainVectorService
from backend.modules.notification.dto.inputs import NotificationUpdatableProperties
from backend.modules.notification.entity.notification import NotificationsStatusEnum
from backend.modules.notification.service.notification_service import (
    NotificationService,
)
from backend.modules.onboarding.service.onboarding_service import OnboardingService
from backend.packages.files.crawl.crawler import CrawlWebsite, slugify
from backend.packages.files.parsers.github import process_github
from backend.packages.files.processors import filter_file
from backend.packages.utils.telemetry import maybe_send_telemetry

logger = get_logger(__name__)

onboardingService = OnboardingService()
@@ -64,7 +67,7 @@ def process_file_and_notify(
file_original_name, only_vectors=True
)

message = filter_file(
filter_file(
file=file_instance,
brain_id=brain_id,
original_file_name=file_original_name,
@@ -102,7 +105,6 @@ def process_crawl_and_notify(
brain_id: UUID,
notification_id=None,
):

crawl_website = CrawlWebsite(url=crawl_website_url)

if not crawl_website.checkGithub():
@@ -123,7 +125,7 @@ def process_crawl_and_notify(
file_size=len(extracted_content),
file_extension=".txt",
)
message = filter_file(
filter_file(
file=file_instance,
brain_id=brain_id,
original_file_name=crawl_website_url,
@@ -136,7 +138,7 @@ def process_crawl_and_notify(
),
)
else:
message = process_github(
process_github(
repo=crawl_website.url,
brain_id=brain_id,
)
File renamed without changes.
42 changes: 21 additions & 21 deletions backend/main.py → backend/backend/main.py
@@ -6,29 +6,30 @@
from dotenv import load_dotenv # type: ignore
from fastapi import FastAPI, HTTPException, Request
from fastapi.responses import HTMLResponse, JSONResponse
from logger import get_logger
from middlewares.cors import add_cors_middleware
from modules.analytics.controller.analytics_routes import analytics_router
from modules.api_key.controller import api_key_router
from modules.assistant.controller import assistant_router
from modules.brain.controller import brain_router
from modules.chat.controller import chat_router
from modules.contact_support.controller import contact_router
from modules.knowledge.controller import knowledge_router
from modules.misc.controller import misc_router
from modules.onboarding.controller import onboarding_router
from modules.prompt.controller import prompt_router
from modules.sync.controller import sync_router
from modules.upload.controller import upload_router
from modules.user.controller import user_router
from packages.utils import handle_request_validation_error
from packages.utils.telemetry import maybe_send_telemetry
from pyinstrument import Profiler
from routes.crawl_routes import crawl_router
from routes.subscription_routes import subscription_router
from sentry_sdk.integrations.fastapi import FastApiIntegration
from sentry_sdk.integrations.starlette import StarletteIntegration

from backend.logger import get_logger
from backend.middlewares.cors import add_cors_middleware
from backend.modules.analytics.controller.analytics_routes import analytics_router
from backend.modules.api_key.controller import api_key_router
from backend.modules.assistant.controller import assistant_router
from backend.modules.brain.controller import brain_router
from backend.modules.chat.controller import chat_router
from backend.modules.contact_support.controller import contact_router
from backend.modules.knowledge.controller import knowledge_router
from backend.modules.misc.controller import misc_router
from backend.modules.onboarding.controller import onboarding_router
from backend.modules.prompt.controller import prompt_router
from backend.modules.sync.controller import sync_router
from backend.modules.upload.controller import upload_router
from backend.modules.user.controller import user_router
from backend.packages.utils import handle_request_validation_error
from backend.packages.utils.telemetry import maybe_send_telemetry
from backend.routes.crawl_routes import crawl_router
from backend.routes.subscription_routes import subscription_router

load_dotenv()

# Set the logging level for all loggers to WARNING
@@ -68,7 +69,6 @@ def before_send(event, hint):
)

app = FastAPI()

add_cors_middleware(app)

app.include_router(brain_router)
@@ -129,4 +129,4 @@ async def http_exception_handler(_, exc):
# run main.py to debug backend
import uvicorn

uvicorn.run(app, host="0.0.0.0", port=5050, log_level="warning", access_log=False)
uvicorn.run(app, host="0.0.0.0", port=5050, log_level="debug", access_log=False)
File renamed without changes.
File renamed without changes.
@@ -3,9 +3,10 @@

from fastapi import Depends, HTTPException, Request
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from middlewares.auth.jwt_token_handler import decode_access_token, verify_token
from modules.api_key.service.api_key_service import ApiKeyService
from modules.user.entity.user_identity import UserIdentity

from backend.middlewares.auth.jwt_token_handler import decode_access_token, verify_token
from backend.modules.api_key.service.api_key_service import ApiKeyService
from backend.modules.user.entity.user_identity import UserIdentity

api_key_service = ApiKeyService()

@@ -54,7 +55,7 @@ async def authenticate(

def get_test_user(self) -> UserIdentity:
return UserIdentity(
email="[email protected]", id="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" # type: ignore
email="[email protected]", id="39418e3b-0258-4452-af60-7acfcc1263ff" # type: ignore
) # replace with test user information


@@ -4,7 +4,8 @@

from jose import jwt
from jose.exceptions import JWTError
from modules.user.entity.user_identity import UserIdentity

from backend.modules.user.entity.user_identity import UserIdentity

SECRET_KEY = os.environ.get("JWT_SECRET_KEY")
ALGORITHM = "HS256"
File renamed without changes.
File renamed without changes.
@@ -1,7 +1,8 @@
from uuid import UUID

from logger import get_logger
from pydantic import ConfigDict, BaseModel
from pydantic import BaseModel, ConfigDict

from backend.logger import get_logger

logger = get_logger(__name__)

File renamed without changes.
@@ -1,7 +1,7 @@
from pydantic import BaseModel


class LLMModels(BaseModel):
class LLMModel(BaseModel):
"""LLM models stored in the database that are allowed to be used by the users.
Args:
BaseModel (BaseModel): Pydantic BaseModel
@@ -2,7 +2,7 @@
from datetime import datetime
from uuid import UUID

from .llm_models import LLMModels
from .llm_models import LLMModel


class Repository(ABC):
@@ -15,7 +15,7 @@ def get_user_usage(self, user_id: UUID):
pass

@abstractmethod
def get_model_settings(self) -> LLMModels | None:
def get_models(self) -> LLMModel | None:
pass

@abstractmethod
6 changes: 6 additions & 0 deletions backend/backend/models/databases/supabase/__init__.py
@@ -0,0 +1,6 @@
from backend.models.databases.supabase.brains_subscription_invitations import (
    BrainSubscription,
)
from backend.models.databases.supabase.files import File
from backend.models.databases.supabase.user_usage import UserUsage
from backend.models.databases.supabase.vectors import Vector
@@ -1,6 +1,5 @@
from models.databases.repository import Repository

from logger import get_logger
from backend.logger import get_logger
from backend.models.databases.repository import Repository

logger = get_logger(__name__)

@@ -1,4 +1,4 @@
from models.databases.repository import Repository
from backend.models.databases.repository import Repository


class File(Repository):
