Update to new PromptingTools RAG (#10)
svilupp authored Apr 18, 2024
1 parent a40ea4e commit 730f5ec
Showing 15 changed files with 717 additions and 336 deletions.
48 changes: 48 additions & 0 deletions Artifacts.toml
@@ -5,3 +5,51 @@ lazy = true
[[juliaextra.download]]
sha256 = "61133afa7e06fda133f07164c57190a5b922f8f2a1aa17c3f8a628b5cf752512"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/juliaextra__v1.10.0__ada1.0.tar.gz"

["julia__textembedding3large-0-Float32"]
git-tree-sha1 = "a105a2482296fa0a80ce0c76677cc9ef673be70e"
lazy = true

[["julia__textembedding3large-0-Float32".download]]
sha256 = "ff4e91908fb54b7919aad9d6a2ac5045124d43eb864fe9f96a7a68d304d4e0a2"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/julia__v1.10.2__textembedding3large-0-Float32__v1.0.tar.gz"

["julia__textembedding3large-1024-Bool"]
git-tree-sha1 = "7eef82f15c72712b4f5fff2449ebf3ed64b56b14"
lazy = true

[["julia__textembedding3large-1024-Bool".download]]
sha256 = "27186886d19ea4c3f1710b4bc70e8e809d906069d5de8c992c948d97d0f454da"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/julia__v1.10.2__textembedding3large-1024-Bool__v1.0.tar.gz"

["tidier__textembedding3large-0-Float32"]
git-tree-sha1 = "680c7035e512844fd2b9af1757b02b931dfadaa5"
lazy = true

[["tidier__textembedding3large-0-Float32".download]]
sha256 = "59eb6fef198e32d238c11d3a95e5201d18cb83c5d42eae753706614c0f72db9e"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/tidier__v20240407__textembedding3large-0-Float32__v1.0.tar.gz"

["tidier__textembedding3large-1024-Bool"]
git-tree-sha1 = "44d861977d663a9c4615023ae38828e0ef88036e"
lazy = true

[["tidier__textembedding3large-1024-Bool".download]]
sha256 = "226cadd2805abb6ab6e561330aca97466e0a2cb1e1eb171be661d9dea9dcacdc"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/tidier__v20240407__textembedding3large-1024-Bool__v1.0.tar.gz"

["makie__textembedding3large-0-Float32"]
git-tree-sha1 = "30c29c10d9b2b160b43f358fad9f4f6fe83ce378"
lazy = true

[["makie__textembedding3large-0-Float32".download]]
sha256 = "ee15489022df191fbede93adf1bd7cc1ceb1f84185229026a5e38ae9a3fab737"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/makie__v20240330__textembedding3large-0-Float32__v1.0.tar.gz"

["makie__textembedding3large-1024-Bool"]
git-tree-sha1 = "a49a86949f86f6cf4c29bdc9559c05064b49c801"
lazy = true

[["makie__textembedding3large-1024-Bool".download]]
sha256 = "135f36effc0d29ed20e9bc877f727e4d9d8366bcae4bf4d13f998529d1091324"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/makie__v20240330__textembedding3large-1024-Bool__v1.0.tar.gz"
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
- (Preliminary) Knowledge packs available for Julia docs (`:julia`), Tidier ecosystem (`:tidier`), Makie ecosystem (`:makie`). Load with `load_index!(:julia)` or several with `load_index!([:julia, :tidier])`.

### Changed
- Bumped up PromptingTools to v0.20 (brings new RAG capabilities, pretty-printing, etc.)
- Changed default model to be GPT-4 Turbo to improve answer quality

### Fixed
- Fixed wrong initialization of `CONV_HISTORY` and other globals that led to `UndefVarError`. Moved several globals to the `const Ref{}` pattern to ensure type stability, but it means that from now on they always need to be dereferenced with `[]` (eg, `MAIN_INDEX[]` instead of `MAIN_INDEX`).
5 changes: 4 additions & 1 deletion Project.toml
@@ -4,8 +4,11 @@ authors = ["J S <[email protected]> and contributors"]
version = "0.0.1-DEV"

[deps]
HDF5 = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"
LazyArtifacts = "4af54fe1-eca0-43a8-85a7-787d91b784e3"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
Preferences = "21216c6a-2e73-6563-6e65-726566657250"
PromptingTools = "670122d1-24a8-4d70-bfce-740807c42192"
REPL = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"
@@ -20,7 +23,7 @@ JSON3 = "1"
LazyArtifacts = "<0.0.1, 1"
LinearAlgebra = "<0.0.1, 1"
Preferences = "1"
PromptingTools = "0.9"
PromptingTools = "0.20"
REPL = "1"
SHA = "0.7"
Serialization = "<0.0.1, 1"
70 changes: 60 additions & 10 deletions README.md
@@ -4,7 +4,8 @@

AIHelpMe harnesses the power of Julia's extensive documentation and advanced AI models to provide tailored coding guidance. By integrating with PromptingTools.jl, it offers a unique, AI-assisted approach to answering your coding queries directly in Julia's environment.

Note: This is only a proof-of-concept. If there is enough interest, we will fine-tune the RAG pipeline for better performance.
> [!CAUTION]
> This is only a proof-of-concept. If there is enough interest, we will fine-tune the RAG pipeline for better performance.
## Features

@@ -27,7 +28,8 @@ Pkg.add(url="https://github.com/svilupp/AIHelpMe.jl")

- Julia (version 1.10 or later).
- Internet connection for API access.
- OpenAI and Cohere API keys (recommended for optimal performance). See [How to Obtain API Keys](#how-to-obtain-api-keys).
- OpenAI API keys with available credits. See [How to Obtain API Keys](#how-to-obtain-api-keys).
- For optimal performance, also get a Cohere API key (free for community use) and a Tavily API key (free for community use).

All setup should take less than 5 minutes!

@@ -40,10 +42,43 @@ All setup should take less than 5 minutes!
```

```plaintext
[ Info: Done generating response. Total cost: $0.001
[ Info: Done generating response. Total cost: $0.015
AIMessage("To implement quicksort in Julia, you can use the `sort` function with the `alg=QuickSort` argument.")
```

Note: By default, we load only the Julia documentation and the docstrings for the standard libraries. The default model used is GPT-4 Turbo.
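
If you need coverage beyond the Julia docs, you can load additional knowledge packs on demand. A minimal sketch, assuming the `:julia`, `:tidier`, and `:makie` packs listed in the changelog (artifacts are downloaded lazily on first use):

```julia
using AIHelpMe

# Load a single knowledge pack (the Julia docs pack is loaded by default)
AIHelpMe.load_index!(:tidier)

# Or load several packs at once
AIHelpMe.load_index!([:julia, :tidier, :makie])
```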

You can pretty-print the answer using `pprint` if you return the full RAGResult (`return_all=true`):
```julia
using AIHelpMe: pprint
result = aihelp("How do I implement quicksort in Julia?", return_all=true)
pprint(result)
```

```plaintext
--------------------
QUESTION(s)
--------------------
- How do I implement quicksort in Julia?
--------------------
ANSWER
--------------------
To implement quicksort in Julia, you can use the [5,1.0]`sort`[1,1.0] function with the [1,1.0]`alg=QuickSort`[1,1.0] argument.[2,1.0]
--------------------
SOURCES
--------------------
1. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Functions
2. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Functions
3. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Algorithms
4. SortingAlgorithms::/README.md::0::SortingAlgorithms
5. AIHelpMe::/README.md::0::AIHelpMe
```

Note: You can see the model cheated because it can see this very documentation...

2. **`aihelp` Macro**:
```julia
aihelp"how to implement quicksort in Julia?"
@@ -56,11 +91,12 @@ All setup should take less than 5 minutes!
Note: The `!` is required for follow-up questions.
`aihelp!` does not add new context/more information - to do that, you need to ask a new question.

4. **Pick stronger models**:
Eg, "gpt4t" is an alias for GPT-4 Turbo:
4. **Pick faster models**:
Eg, for simple questions, GPT 3.5 might be enough, so use the alias "gpt3t":
```julia
aihelp"Elaborate on the `sort` function and quicksort algorithm"gpt4t
aihelp"Elaborate on the `sort` function and quicksort algorithm"gpt3t
```

```plaintext
[ Info: Done generating response. Total cost: $0.002
AIMessage("The `sort` function in programming languages, including Julia.... continues for a while!
@@ -69,22 +105,36 @@ All setup should take less than 5 minutes!
5. **Debugging**:
How did you come up with that answer? Check the "context" provided to the AI model (ie, the documentation snippets that were used to generate the answer):
```julia
const AHM = AIHelpMe
AHM.preview_context()
AIHelpMe.pprint(AIHelpMe.LAST_RESULT[])
# Output: Pretty-printed Question + Context + Answer with color highlights
```

The color highlights show you which words were NOT supported by the provided context (magenta = completely new, blue = partially new).
It's a quick and intuitive way to see which function names or variables are made up versus which ones were in the context.

You can change the kwargs of `pprint` to hide the annotations or potentially even show the underlying context (snippets from the documentation):

```julia
AIHelpMe.pprint(AIHelpMe.LAST_RESULT[]; add_context = true, add_scores = false)
```

## How to Obtain API Keys

### OpenAI API Key:
1. Visit [OpenAI's API portal](https://openai.com/api/).
2. Sign up and generate an API Key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions/#Configuring-the-Environment-Variable-for-API-Key).
3. Add some credits ($5 minimum, but that will last you a long time).
4. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).
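
Setting the key from within Julia can be as simple as the sketch below (the `set_preferences!` call assumes the preference name used by PromptingTools.jl; replace the placeholder with your actual key):

```julia
# Option 1: set for the current Julia session only
ENV["OPENAI_API_KEY"] = "<your-api-key>"

# Option 2: persist it as a local preference via PromptingTools.jl
using PromptingTools
PromptingTools.set_preferences!("OPENAI_API_KEY" => "<your-api-key>")
```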

### Cohere API Key:
1. Sign up at [Cohere's registration page](https://dashboard.cohere.com/welcome/register).
2. After registering, visit the [API keys section](https://dashboard.cohere.com/api-keys) to obtain a free, rate-limited Trial key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions/#Configuring-the-Environment-Variable-for-API-Key).
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

### Tavily API Key:
1. Sign up at [Tavily](https://app.tavily.com/sign-in).
2. After registering, generate an API key on the [Overview page](https://app.tavily.com/home). You can get a free, rate-limited Trial key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

## Usage

69 changes: 59 additions & 10 deletions docs/src/index.md
@@ -24,14 +24,15 @@ To install AIHelpMe, use the Julia package manager and the address of the reposi

```julia
using Pkg
Pkg.add("https://github.com/svilupp/AIHelpMe.jl")
Pkg.add(url="https://github.com/svilupp/AIHelpMe.jl")
```

**Prerequisites:**

- Julia (version 1.10 or later).
- Internet connection for API access.
- OpenAI and Cohere API keys (recommended for optimal performance). See [How to Obtain API Keys](#how-to-obtain-api-keys).
- OpenAI API keys with available credits. See [How to Obtain API Keys](#how-to-obtain-api-keys).
- For optimal performance, also get a Cohere API key (free for community use) and a Tavily API key (free for community use).

All setup should take less than 5 minutes!

@@ -44,10 +45,43 @@ All setup should take less than 5 minutes!
```

```plaintext
[ Info: Done generating response. Total cost: $0.001
[ Info: Done generating response. Total cost: $0.015
AIMessage("To implement quicksort in Julia, you can use the `sort` function with the `alg=QuickSort` argument.")
```

Note: By default, we load only the Julia documentation and the docstrings for the standard libraries. The default model used is GPT-4 Turbo.
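
If you need coverage beyond the Julia docs, you can load additional knowledge packs on demand. A minimal sketch, assuming the `:julia`, `:tidier`, and `:makie` packs listed in the changelog (artifacts are downloaded lazily on first use):

```julia
using AIHelpMe

# Load a single knowledge pack (the Julia docs pack is loaded by default)
AIHelpMe.load_index!(:tidier)

# Or load several packs at once
AIHelpMe.load_index!([:julia, :tidier, :makie])
```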

You can pretty-print the answer using `pprint` if you return the full RAGResult (`return_all=true`):
```julia
using AIHelpMe: pprint
result = aihelp("How do I implement quicksort in Julia?", return_all=true)
pprint(result)
```

```plaintext
--------------------
QUESTION(s)
--------------------
- How do I implement quicksort in Julia?
--------------------
ANSWER
--------------------
To implement quicksort in Julia, you can use the [5,1.0]`sort`[1,1.0] function with the [1,1.0]`alg=QuickSort`[1,1.0] argument.[2,1.0]
--------------------
SOURCES
--------------------
1. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Functions
2. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Functions
3. https://docs.julialang.org/en/v1.10.2/base/sort/index.html::Sorting and Related Functions/Sorting Algorithms
4. SortingAlgorithms::/README.md::0::SortingAlgorithms
5. AIHelpMe::/README.md::0::AIHelpMe
```

Note: You can see the model cheated because it can see this very documentation...

2. **`aihelp` Macro**:
```julia
aihelp"how to implement quicksort in Julia?"
@@ -60,11 +94,12 @@ All setup should take less than 5 minutes!
Note: The `!` is required for follow-up questions.
`aihelp!` does not add new context/more information - to do that, you need to ask a new question.

4. **Pick stronger models**:
Eg, "gpt4t" is an alias for GPT-4 Turbo:
4. **Pick faster models**:
Eg, for simple questions, GPT 3.5 might be enough, so use the alias "gpt3t":
```julia
aihelp"Elaborate on the `sort` function and quicksort algorithm"gpt4t
aihelp"Elaborate on the `sort` function and quicksort algorithm"gpt3t
```

```plaintext
[ Info: Done generating response. Total cost: $0.002
AIMessage("The `sort` function in programming languages, including Julia.... continues for a while!
@@ -73,22 +108,36 @@ All setup should take less than 5 minutes!
5. **Debugging**:
How did you come up with that answer? Check the "context" provided to the AI model (ie, the documentation snippets that were used to generate the answer):
```julia
const AHM = AIHelpMe
AHM.preview_context()
AIHelpMe.pprint(AIHelpMe.LAST_RESULT[])
# Output: Pretty-printed Question + Context + Answer with color highlights
```

The color highlights show you which words were NOT supported by the provided context (magenta = completely new, blue = partially new).
It's a quick and intuitive way to see which function names or variables are made up versus which ones were in the context.

You can change the kwargs of `pprint` to hide the annotations or potentially even show the underlying context (snippets from the documentation):

```julia
AIHelpMe.pprint(AIHelpMe.LAST_RESULT[]; add_context = true, add_scores = false)
```

## How to Obtain API Keys

### OpenAI API Key:
1. Visit [OpenAI's API portal](https://openai.com/api/).
2. Sign up and generate an API Key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions/#Configuring-the-Environment-Variable-for-API-Key).
3. Add some credits ($5 minimum, but that will last you a long time).
4. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).
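
Setting the key from within Julia can be as simple as the sketch below (the `set_preferences!` call assumes the preference name used by PromptingTools.jl; replace the placeholder with your actual key):

```julia
# Option 1: set for the current Julia session only
ENV["OPENAI_API_KEY"] = "<your-api-key>"

# Option 2: persist it as a local preference via PromptingTools.jl
using PromptingTools
PromptingTools.set_preferences!("OPENAI_API_KEY" => "<your-api-key>")
```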

### Cohere API Key:
1. Sign up at [Cohere's registration page](https://dashboard.cohere.com/welcome/register).
2. After registering, visit the [API keys section](https://dashboard.cohere.com/api-keys) to obtain a free, rate-limited Trial key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://svilupp.github.io/PromptingTools.jl/dev/frequently_asked_questions/#Configuring-the-Environment-Variable-for-API-Key).
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

### Tavily API Key:
1. Sign up at [Tavily](https://app.tavily.com/sign-in).
2. After registering, generate an API key on the [Overview page](https://app.tavily.com/home). You can get a free, rate-limited Trial key.
3. Set it as an environment variable or a local preference in PromptingTools.jl. See the [instructions](https://siml.earth/PromptingTools.jl/dev/frequently_asked_questions#Configuring-the-Environment-Variable-for-API-Key).

## Usage

31 changes: 21 additions & 10 deletions src/AIHelpMe.jl
@@ -4,35 +4,46 @@ using Preferences, Serialization, LinearAlgebra, SparseArrays
using LazyArtifacts
using Base.Docs: DocStr, MultiDoc, doc, meta
using REPL: stripmd
using HDF5

using PromptingTools
using PromptingTools: pprint
using PromptingTools.Experimental.RAGTools
using PromptingTools.Experimental.RAGTools: AbstractRAGConfig, getpropertynested,
setpropertynested, merge_kwargs_nested
using SHA: sha256, bytes2hex
using Logging, PrecompileTools
const PT = PromptingTools
const RAG = PromptingTools.Experimental.RAGTools
const RT = PromptingTools.Experimental.RAGTools

## export load_index!, last_context, update_index!
## export remove_pkgdir, annotate_source, find_new_chunks
include("utils.jl")

## Globals and types are defined in here
include("pipeline_defaults.jl")

## export docdata_to_source, docextract, build_index
include("preparation.jl")

## export load_index!, update_index!
include("loading.jl")

export aihelp
include("generation.jl")

export @aihelp_str, @aihelp!_str
include("macros.jl")

## Globals
const CONV_HISTORY = Vector{Vector{PT.AbstractMessage}}()
const CONV_HISTORY_LOCK = ReentrantLock()
const MAX_HISTORY_LENGTH = 1
const LAST_CONTEXT = Ref{Union{Nothing, RAG.RAGContext}}(nothing)
const MAIN_INDEX = Ref{Union{Nothing, RAG.AbstractChunkIndex}}(nothing)
function __init__()
## Load index
MAIN_INDEX[] = load_index!()
## Set the active configuration
update_pipeline!(:bronze)
## Load index - auto-loads into MAIN_INDEX
load_index!(:julia)
end

# Enable precompilation to reduce start time, disabled logging
with_logger(NullLogger()) do
@compile_workload include("precompilation.jl")
end

end