DocumentsGPT

Chat with your documents using embeddings and gpt-3.5-turbo by OpenAI

Introduction

This project lets you chat with your documents, books, scripts, etc., as long as they are in .pdf format. To do this, it generates embeddings for the document with the OpenAI API and queries them through the ConversationalRetrievalChain provided by LangChain. The passages most relevant to your prompt are sent to OpenAI together with the prompt itself, so you can ask specific questions about your document, such as 'Write a summary about chapter XY' or 'Generate a short summary of chapter XY in bullet points for a PowerPoint presentation'.
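The retrieval idea behind this can be illustrated without any API calls. The following is a minimal sketch, not the project's actual code: the embed function is a toy bag-of-words stand-in for a real embedding model, and the names retrieve and build_prompt are made up for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine the retrieved text and the question into one prompt string."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical document chunks, as a PDF might be split into.
chunks = [
    "Chapter 1 introduces the main character and the village.",
    "Chapter 2 describes the journey across the mountains.",
    "The appendix lists all place names.",
]
question = "What happens in chapter 2?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

In the real project the embedding step and the final answer both come from OpenAI models, and the ranking is done by the vector store rather than by a Python sort.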

Installation

Clone the project and install the requirements using the following snippet:

$ git clone git@github.com:jimmymeister98/DocsGPT.git
$ cd DocsGPT
$ pip install -r requirements.txt

(You may need to install additional requirements that are reported in the console at runtime, which I have not tracked yet.)
Get your OpenAI API key and paste it into the .env file. Embedding and prompting costs are listed on the OpenAI Pricing page; the models used are "text-embedding-ada-002-v2" and "gpt-3.5-turbo".
Paste the path of your file into the variable pdf_path in main.py (subject to change; this will probably become a prompt in the near future).
Run with python main.py or by clicking the run button in the IDE of your choice.
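If the program fails at startup, a missing or empty key in the .env file is a likely cause. The sketch below shows one way to check for it using only the standard library. The variable name OPENAI_API_KEY and the helper names are assumptions made for this example; the README does not say which name main.py expects or how it reads the file.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env(path: str = ".env") -> dict[str, str]:
    """Read a .env file from disk and parse it."""
    return parse_env_text(Path(path).read_text(encoding="utf-8"))


def require_key(env: dict[str, str], name: str = "OPENAI_API_KEY") -> str:
    """Return the key, or fail early with a readable message."""
    key = env.get(name, "")
    if not key:
        raise RuntimeError(f"{name} is missing or empty in the .env file")
    return key
```

Calling require_key(load_env()) before any request is made gives a clear error message instead of an authentication failure later on.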

Usage

$ python main.py

Features

Persistent Vector Stores: Vector stores are stored locally, so embeddings only need to be generated once per document.
- Deep Lake: You can either store your vector stores locally or visualize and host them on Activeloop with minimal changes. The ability to switch between local and remote storage will be added later.
Prompt Chaining: With the use of history-aware prompting, it is possible to chain prompts together. For example: "...add a short summarizing sentence to the previous prompt"
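The "generate once per document" behaviour can be sketched as a small on-disk cache keyed by a hash of the file contents. This is only an illustration of the idea: it stores vectors as JSON files rather than in Deep Lake, and load_or_build, file_fingerprint, and fake_embed are hypothetical names, not part of the project.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def file_fingerprint(data: bytes) -> str:
    """Identify a document by the SHA-256 hash of its contents."""
    return hashlib.sha256(data).hexdigest()


def load_or_build(
    data: bytes, store_dir: Path, build: Callable[[bytes], list]
) -> tuple[list, bool]:
    """Return (vectors, was_cached); call build only on a cache miss."""
    store_dir.mkdir(parents=True, exist_ok=True)
    cache_file = store_dir / f"{file_fingerprint(data)}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8")), True
    vectors = build(data)
    cache_file.write_text(json.dumps(vectors), encoding="utf-8")
    return vectors, False


calls = []


def fake_embed(data: bytes) -> list:
    """Stand-in for the paid embedding call; records how often it runs."""
    calls.append(1)
    return [[float(len(data))]]


with tempfile.TemporaryDirectory() as tmp:
    first, cached_first = load_or_build(b"pdf bytes", Path(tmp), fake_embed)
    second, cached_second = load_or_build(b"pdf bytes", Path(tmp), fake_embed)
```

The second call finds the cached file and never reaches fake_embed, which is the property that avoids paying for embeddings again when the same document is reopened.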

Ideas

Local Embedding: Create embeddings locally using LLaMA and a fitting model (7B to start with)
Local Prompting: Prompt against a local LLM like GPT4All, Vicuna or LLaMA
- (Both will need rather beefy hardware, but will cut the cost of embedding and prompting)
Add prompting for files and expand prompting in general
Add a GPT-like web UI
Switch between local and remote saving of embeddings

Contributing

Fork the repository.
Create a new branch: git checkout -b feature-branch.
Make your changes and commit them: git commit -m 'Add new feature'.
Push to the branch: git push origin feature-branch.
Submit a pull request.

License

The license for this project is yet to be determined.
