Added max_new_tokens as a config option to llm yaml block #1317

Merged · 6 commits · Nov 26, 2023
Changes from 1 commit
15 changes: 15 additions & 0 deletions fern/docs/pages/manual/settings.mdx
@@ -77,4 +77,19 @@ Missing variables with no default will produce an error.
```yaml
server:
  port: ${PORT:8001}
```

## LLM config options

The `llm` section of the settings allows for the following configurations:

- `mode`: how to run your LLM
- `max_new_tokens`: the maximum number of new tokens the LLM will generate and add to the context window (Llama.cpp defaults to 256)

Example:

```yaml
llm:
  mode: local
  max_new_tokens: 256
```
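
Because the settings file supports the environment-variable substitution documented above, the new option can also be given an environment-backed default. A minimal sketch, assuming a hypothetical `LLM_MAX_NEW_TOKENS` variable (the name is illustrative) with 256 as the fallback:

```yaml
llm:
  mode: local
  # LLM_MAX_NEW_TOKENS is a hypothetical variable name; 256 is used when it is unset
  max_new_tokens: ${LLM_MAX_NEW_TOKENS:256}
```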
1 change: 1 addition & 0 deletions private_gpt/components/llm/llm_component.py
@@ -31,6 +31,7 @@ def __init__(self, settings: Settings) -> None:
self.llm = LlamaCPP(
model_path=str(models_path / settings.local.llm_hf_model_file),
temperature=0.1,
max_new_tokens=settings.llm.max_new_tokens,
# llama2 has a context window of 4096 tokens,
# but we set it lower to allow for some wiggle room
context_window=3900,
1 change: 1 addition & 0 deletions private_gpt/settings/settings.py
@@ -82,6 +82,7 @@ class DataSettings(BaseModel):

class LLMSettings(BaseModel):
mode: Literal["local", "openai", "sagemaker", "mock"]
max_new_tokens: int


class VectorstoreSettings(BaseModel):
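A possible follow-up, not part of this commit, would be to give the new field a default and a description via Pydantic's `Field`, so configurations that omit `max_new_tokens` keep loading. A minimal sketch under that assumption:

```python
from typing import Literal

from pydantic import BaseModel, Field


class LLMSettings(BaseModel):
    mode: Literal["local", "openai", "sagemaker", "mock"]
    # Hypothetical refinement: default to Llama.cpp's 256 and document the field
    max_new_tokens: int = Field(
        256,
        description="Maximum number of new tokens the LLM will generate and add to the context window.",
    )
```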
1 change: 1 addition & 0 deletions settings.yaml
@@ -22,6 +22,7 @@ ui:

llm:
mode: local
max_new_tokens: 256
embedding:
# Should be matching the value above in most cases
mode: local