
Commit

Merge branch 'main' into dev/ben-390
bLopata committed Oct 17, 2024
2 parents 9d10319 + 01c1716 commit 7025c2a
Showing 51 changed files with 10,082 additions and 5,773 deletions.
158 changes: 59 additions & 99 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# tutor-gpt
# Tutor-GPT

![Static Badge](https://img.shields.io/badge/Version-0.6.0-blue)
[![Discord](https://img.shields.io/discord/1076192451997474938?logo=discord&logoColor=%23ffffff&label=Bloom&labelColor=%235865F2)](https://discord.gg/bloombotai)
@@ -7,19 +7,19 @@
[![X (formerly Twitter) URL](https://img.shields.io/twitter/url?url=https%3A%2F%2Ftwitter.com%2FBloomBotAI&label=Twitter)](https://twitter.com/BloomBotAI)
[![arXiv](https://img.shields.io/badge/arXiv-2310.06983-b31b1b.svg)](https://arxiv.org/abs/2310.06983)

Tutor-GPT is a LangChain LLM application developed by [Plastic
Tutor-GPT is an LLM powered learning companion developed by [Plastic
Labs](https://plasticlabs.ai). It dynamically reasons about your learning needs
and _updates its own prompts_ to best serve you.

We leaned into theory of mind experiments and it is now more than just a
literacy tutor; it's an expansive learning companion. Read more about how it
works [here](https://blog.plasticlabs.ai/blog/Theory-of-Mind-Is-All-You-Need).

Tutor-GPT is powered by [Honcho](https://honcho.dev) to build robust user
representations and create a personalized experience for each user.

The hosted version of `tutor-gpt` is called [Bloom](https://bloombot.ai) as a
nod to Benjamin Bloom's Two Sigma Problem. You can try the web version at
[chat.bloombot.ai](https://chat.bloombot.ai) or you can join our
[Discord](https://discord.gg/bloombotai) to try out our implementation for free
(while our OpenAI spend lasts 😄).
nod to Benjamin Bloom's Two Sigma Problem.

Alternatively, you can run your own instance of the bot by following the
instructions below.
@@ -31,52 +31,53 @@ the backend logic for different clients.

- `agent/` - this contains the core logic and prompting architecture
- `bot/` - this contains the discord bot implementation
- `api/` - this contains an API interface to the tutor-gpt backend
- `api/` - this contains a FastAPI API interface that exposes the `agent/` logic
- `www/` - this contains a `NextJS` web front end that can connect to the API interface
- `common/` - this contains common code used in different interfaces
- `supabase/` - contains SQL scripts necessary for setting up local supabase

Most of the project is developed using python with the exception of the NextJS
application. For python `poetry` is used for dependency management and for the
web interface `yarn` is used.

### Supabase

Additionally, this project uses supabase for managing different users,
authentication, and as the database for holding message and conversation
information. For testing and local development we recommend using a local instance of Supabase; the supabase-cli is the best way to do this.

Follow the [Supabase Documentation](https://supabase.com/docs/guides/cli/local-development) for more information. The project contains a `supabase/` folder that contains the scaffolding SQL migrations necessary for setting up the necessary tables. Once you have the supabase cli installed you can simply run the below command in the `tutor-gpt` folder and a local instance of Supabase will start up.
application. For python [`uv`](https://docs.astral.sh/uv/) is used for dependency management and for the
web interface we use `pnpm`.

> NOTE: Local Supabase relies on docker so ensure docker is also running before running the below command
The `bot/` and `api/` modules both use `agent/` as a dependency and load it as a
local package using `uv`.

```bash
supabase start
```
> NOTE
> More information about the web interface is available in
> [www/README](./www/README.md); this README primarily contains information about
> the backend of tutor-gpt and the core logic of the tutor.
Another useful note about testing locally with supabase: there is no need to
verify an account when it is created, so you can create a new account on the
webui and then immediately sign in with it.
The `agent`, `bot`, and `api` modules are all managed using a `uv` [workspace](https://docs.astral.sh/uv/concepts/workspaces/#getting-started).

## Installation

> NOTE: The project uses
> [poetry](https://python-poetry.org/docs/#installing-with-the-official-installer)
> and [yarn](https://yarnpkg.com/getting-started/install) for package
> management.
This section goes over how to set up a python environment for running Tutor-GPT.
This will let you run the discord bot, run the FastAPI application, or develop
the `agent` code.

The below commands will install all the dependencies necessary for running the
tutor-gpt project. We recommend using poetry to setup a virtual environment for
tutor-gpt project. We recommend using uv to setup a virtual environment for
the project.

```bash
git clone https://github.com/plastic-labs/tutor-gpt.git
cd tutor-gpt
poetry install # install Python dependencies
cd www/
yarn install # install all NodeJS dependencies
git clone https://github.com/plastic-labs/tutor-gpt.git && cd tutor-gpt
uv sync # set up the workspace
source .venv/bin/activate # activate the virtual environment
```

From here you will then need to run `uv sync` in the appropriate directory
depending on which part of the project you want to run. For example, to run
the FastAPI application you need to navigate to the `api/` directory and re-run sync

```bash
cd api/
uv sync
```

You should see a message indicating that the dependencies were resolved and/or
installed if they were not already installed.

### Docker

Alternatively (the recommended way), this project can be built and run with
@@ -96,91 +97,55 @@ docker build -t tutor-gpt-core .

Similarly, to build the web interface run the below commands

```bash
cd tutor-gpt/www
docker build -t tutor-gpt-web .
```

> NOTE: for poetry usage
This project uses [poetry](https://python-poetry.org/) to manage dependencies.
To install dependencies locally run `poetry install`. Or alternatively run
`poetry shell` to activate the virtual environment
## Usage

To activate the virtual environment within the same shell you can use the
following one-liner:
Each of the interfaces of tutor-gpt requires different environment variables to
operate properly. Both the `bot/` and `api/` modules contain a `.env.template`
file that you can use as a starting point. Copy and rename the `.env.template`
to `.env`.

```bash
source $(poetry env info --path)/bin/activate
```
Below are more detailed explanations of the environment variables.

On some systems this may not detect the proper virtual environment. You can
diagnose this by running `poetry env info` directly to see if the virtualenv
is defined.
### Common

If using `pyenv` remember to set **prefer-active-python** to true. As per
this section of the [documentation](https://python-poetry.org/docs/managing-environments/).
**Azure Mirascope Keys**

Another workaround that may work if the above setting does not work is to
continue directly with `poetry shell` or wrap the source command like below
- `AZURE_OPENAI_ENDPOINT` — The endpoint for the Azure OpenAI service
- `AZURE_OPENAI_API_KEY` — The API key for the Azure OpenAI service
- `AZURE_OPENAI_API_VERSION` — The API version for the Azure OpenAI service
- `AZURE_OPENAI_DEPLOYMENT` — The deployment name for the Azure OpenAI service
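As a sketch, these variables are typically read with `os.getenv`; the fallbacks below mirror the ones used in `agent/agent/chain.py` (the "placeholder" defaults will be rejected by Azure at request time, so set real values in your `.env`):

```python
from os import getenv

# Fallbacks mirror agent/agent/chain.py; "placeholder" values will fail at
# request time, so real values must be provided via the .env file.
azure_settings = {
    "api_key": getenv("AZURE_OPENAI_API_KEY", "placeholder"),
    "azure_endpoint": getenv("AZURE_OPENAI_ENDPOINT", "placeholder"),
    "api_version": getenv("AZURE_OPENAI_API_VERSION", "2024-02-01"),
    "deployment": getenv("AZURE_OPENAI_DEPLOYMENT", "placeholder"),
}
print(sorted(azure_settings))
```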

```bash
poetry run source $(poetry env info --path)/bin/activate
```
### FastAPI

## Usage
**NextJS & FastAPI**

This app requires you to have a few different environment variables set. Create
a `.env` file from the `.env.template`. Depending on which interface you are
running (web or discord) different variables are necessary. This is explained
below
- `URL` — The URL endpoint for the frontend Next.js application
- `HONCHO_URL` — The base URL for the instance of Honcho you are using
- `HONCHO_APP_NAME` — The name of the Honcho application to use for Tutor-GPT

### Required
**Optional Extras**

- **OPENAI_API_KEY**: Go to [OpenAI](https://beta.openai.com/account/api-keys) to generate your own API key.
- **SUPABASE_URL**: The base URL for your supabase instance
- **SUPABASE_KEY**: The API key for interacting with your supabase project. This corresponds to the service key, get it from your project settings
- **CONVERSATION_TABLE**: the name of the table to hold conversation metadata
- **MEMORY_TABLE**: the name of the table holding messages for different conversations
- `SENTRY_DSN_API` — The Sentry DSN for optional error reporting

### Discord Only
### Discord

- **BOT_TOKEN**: This is the discord bot token. You can find instructions on how
- `BOT_TOKEN` — The Discord bot token. You can find instructions on how
to create a bot and generate a token in the [pycord
docs](https://guide.pycord.dev/getting-started/creating-your-first-bot).
- **THOUGHT_CHANNEL_ID**: This is the discord channel for the bot to output
- `THOUGHT_CHANNEL_ID` — The Discord channel for the bot to output
thoughts to. Make a channel in your server and copy the ID by right clicking the
channel and copying the link. The channel ID is the last string of numbers in
the link.
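A minimal sketch of reading these two variables at startup (illustrative only; the stand-in values are made up, and this is not the bot's actual code). `THOUGHT_CHANNEL_ID` is a numeric Discord snowflake, so it should be converted to an `int` and validated early:

```python
import os

# Stand-in values so the sketch runs without a real .env; these are not real
# credentials. In the bot they would come from the copied .env file.
os.environ.setdefault("BOT_TOKEN", "example-token")
os.environ.setdefault("THOUGHT_CHANNEL_ID", "123456789012345678")

token = os.environ["BOT_TOKEN"]  # a KeyError here means the .env was not loaded
thought_channel_id = int(os.environ["THOUGHT_CHANNEL_ID"])  # numeric snowflake
print(thought_channel_id)
```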

### Web Only

- **URL**: the URL that the web ui is running from; by default this should be http://localhost:3000

### Web UI Environment

The `NextJS` application in `www/` also has its own environment variables, which are usually held in the `.env.local` file. There is another `.env.template` file that you can use for getting started. These are explained below.

- **NEXT_PUBLIC_URL**: The URL the web application will be accessible at; the default with `NextJS` is http://localhost:3000
- **NEXT_PUBLIC_API_URL**: The URL the API backend will run from; the default for `FastAPI` is http://localhost:8000
- **NEXT_PUBLIC_SUPABASE_URL**: The URL for your supabase project; should be identical to the one used in the python backend
- **NEXT_PUBLIC_SUPABASE_ANON_KEY**: The API key for supabase; this time it is the anon key, NOT the service key
- **NEXT_PUBLIC_SENTRY_DSN**: Optional, for sentry bug tracking
- **NEXT_PUBLIC_SENTRY_ENVIRONMENT**: Optional, for sentry bug tracking
- **NEXT_PUBLIC_POSTHOG_KEY**: Optional, for Posthog event tracking
- **NEXT_PUBLIC_POSTHOG_HOST**: Optional, for Posthog event tracking

---

### Docker/Containerization

You can also optionally use the docker containers to run the application locally. Below is the command to run the discord bot locally using a `.env` file that is not within the docker container. Be careful not to copy your `.env` into the docker image, as this is insecure and can leak your secrets.

```bash
docker run --env-file .env tutor-gpt-core python -u -m bot.app
docker run --env-file .env tutor-gpt-core python bot/app.py
```

To run the webui you need to run the backend `FastAPI` and the frontend `NexTJS` containers separately. In two separate terminal instances run the following commands to have both applications run.
To run the webui you need to run the backend `FastAPI` and the frontend `NextJS` containers separately. In two separate terminal instances run the following commands to have both applications run.
The current behaviour will utilize the `.env` file in your local repository and
run the bot.

@@ -191,11 +156,6 @@ docker run tutor-gpt-web

> NOTE: the default run command in the docker file for the core runs the FastAPI backend, so you could just run `docker run --env-file .env tutor-gpt-core`
### Architecture

Below is a high-level diagram of the architecture for the bot.
![Tutor-GPT Discord Architecture](<assets/ToM Chain Flow.png>)

## Contributing

This project is completely open source and welcomes any and all open source contributions. The workflow for contributing is to make a fork of the repository. You can claim an issue in the issues tab or start a new thread to indicate a feature or bug fix you are working on.
28 changes: 19 additions & 9 deletions agent/agent/chain.py
@@ -1,7 +1,7 @@
from os import getenv
from typing import List

from openai import AzureOpenAI
from openai import OpenAI
from dotenv import load_dotenv

from honcho import Honcho
@@ -30,13 +30,9 @@ def __init__(

model_config = ConfigDict(arbitrary_types_allowed=True)

openai = AzureOpenAI(
api_key=getenv("AZURE_OPENAI_API_KEY", "placeholder"),
azure_endpoint=getenv("AZURE_OPENAI_ENDPOINT", "placeholder"),
api_version=getenv("AZURE_OPENAI_API_VERSION", "2024-02-01"),
)
openai = OpenAI()

model = getenv("AZURE_OPENAI_DEPLOYMENT", "placeholder")
model = "gpt-4o"


class ThinkCall(HonchoCall):
@@ -82,9 +78,16 @@ def history(self) -> str:
continue
return history_str

def call(self):
response = self.openai.chat.completions.create(
model=self.model,
messages=[self.template(), {"role": "user", "content": self.user_input}],
)
return response.choices[0].message

def stream(self):
completion = self.openai.chat.completions.create(
model=getenv("AZURE_OPENAI_DEPLOYMENT", "placeholder"),
model=self.model,
messages=[self.template(), {"role": "user", "content": self.user_input}],
stream=True,
)
@@ -127,9 +130,16 @@ def history(self) -> List[dict]:
history_list.append({"role": "assistant", "content": message.content})
return history_list

def call(self):
response = self.openai.chat.completions.create(
model=self.model,
messages=self.template(),
)
return response.choices[0].message

def stream(self):
completion = self.openai.chat.completions.create(
model=getenv("AZURE_OPENAI_DEPLOYMENT", "placeholder"),
model=self.model,
messages=self.template(),
stream=True,
)
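The new `call`/`stream` methods added above both follow the standard OpenAI chat-completions shape: a blocking request returns `choices[0].message`, while a streamed request is iterated chunk by chunk via `choices[0].delta.content`. Below is a self-contained sketch of that shape, with a stubbed client standing in for `OpenAI()` so it runs without network access or an API key:

```python
from types import SimpleNamespace


# Offline stand-in for the OpenAI client used in chain.py, so the call/stream
# shape can be demonstrated without network access or an API key.
class FakeCompletions:
    def create(self, model, messages, stream=False):
        if stream:
            # Streamed responses arrive as chunks exposing .choices[0].delta.content
            return (
                SimpleNamespace(
                    choices=[SimpleNamespace(delta=SimpleNamespace(content=part))]
                )
                for part in ("Hel", "lo")
            )
        # Blocking responses carry the full text in .choices[0].message
        return SimpleNamespace(
            choices=[SimpleNamespace(message=SimpleNamespace(content="Hello"))]
        )


class FakeClient:
    chat = SimpleNamespace(completions=FakeCompletions())


openai = FakeClient()
model = "gpt-4o"

# call(): one blocking request, returning the first choice's message
message = openai.chat.completions.create(
    model=model, messages=[{"role": "user", "content": "hi"}]
).choices[0].message

# stream(): iterate the chunks and join each delta's content
chunks = openai.chat.completions.create(
    model=model, messages=[{"role": "user", "content": "hi"}], stream=True
)
streamed = "".join(chunk.choices[0].delta.content for chunk in chunks)

print(message.content, streamed)
```

In the real code the same two request forms are issued through `self.openai`, with `self.template()` supplying the system prompt and conversation history.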
11 changes: 2 additions & 9 deletions api/.env.template
@@ -1,12 +1,5 @@
# Azure Mirascope Keys
AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_API_KEY=
AZURE_OPENAI_API_VERSION=
AZURE_OPENAI_DEPLOYMENT=

# Supabase Settings
SUPABASE_URL=
SUPABASE_KEY=
# OpenAI Keys
OPENAI_API_KEY=

# NextJS & fastAPI
URL=http://localhost:3000
7 changes: 2 additions & 5 deletions bot/.env.template
@@ -1,8 +1,5 @@
# Azure Mirascope Keys
AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_API_KEY=
AZURE_OPENAI_API_VERSION=
AZURE_OPENAI_DEPLOYMENT=
# OpenAI Keys
OPENAI_API_KEY=

# Discord Settings
BOT_TOKEN=
6 changes: 0 additions & 6 deletions www/.eslintrc.json

This file was deleted.

17 changes: 10 additions & 7 deletions www/Dockerfile
@@ -12,7 +12,7 @@ COPY package.json yarn.lock* package-lock.json* pnpm-lock.yaml* ./
RUN \
if [ -f yarn.lock ]; then yarn --frozen-lockfile; \
elif [ -f package-lock.json ]; then npm ci; \
elif [ -f pnpm-lock.yaml ]; then yarn global add pnpm && pnpm i --frozen-lockfile; \
elif [ -f pnpm-lock.yaml ]; then corepack enable pnpm && pnpm i --frozen-lockfile; \
else echo "Lockfile not found." && exit 1; \
fi

@@ -27,16 +27,19 @@ COPY . .
# Uncomment the following line in case you want to disable telemetry during the build.
# ENV NEXT_TELEMETRY_DISABLED 1

RUN yarn build

# If using npm comment out above and use below instead
# RUN npm run build
RUN \
if [ -f yarn.lock ]; then yarn run build; \
elif [ -f package-lock.json ]; then npm run build; \
elif [ -f pnpm-lock.yaml ]; then corepack enable pnpm && pnpm run build; \
else echo "Lockfile not found." && exit 1; \
fi

# Production image, copy all the files and run next
FROM base AS runner
WORKDIR /app

ENV NODE_ENV production
ENV NODE_ENV=production
# Uncomment the following line in case you want to disable telemetry during runtime.
# ENV NEXT_TELEMETRY_DISABLED 1

@@ -58,8 +61,8 @@ USER nextjs

EXPOSE 3000

ENV PORT 3000
ENV PORT=3000
# set hostname to localhost
ENV HOSTNAME "0.0.0.0"
ENV HOSTNAME="0.0.0.0"

CMD ["node", "server.js"]
