
Feature(Client): Client-side LLM generation #154

Open · 1 task
justinthelaw opened this issue Aug 10, 2023 · 2 comments
Labels: feature (New feature or request), help wanted (Extra attention is needed), javascript (Pull requests that update Javascript code)

justinthelaw (Owner) commented Aug 10, 2023

Is your feature request related to a problem? Please describe.
Hosting an API via a separate server requires extra resources and configuration.

Describe the solution you'd like

  • Add TensorFlow.js to the client, serve the model directly in the user's browser, and host the built app on GitHub Pages (see the sketch below).
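
For a rough sense of what this could look like, here is a minimal sketch (not the project's actual code), assuming the fine-tuned model has been converted for the browser with the tensorflowjs_converter tool and published as static files in the Pages build. The model path, the input_ids signature name, and the encode/decode tokenizer helpers are all hypothetical placeholders.

import * as tf from '@tensorflow/tfjs';

// Hypothetical path to a converted model shipped with the GitHub Pages build
const MODEL_URL = '/model/model.json';

let model;

async function loadModel() {
  // Weights download once; the browser caches them on subsequent visits
  model = await tf.loadGraphModel(MODEL_URL);
}

async function generateBullet(text) {
  // encode/decode are hypothetical tokenizer helpers the client would need
  const inputIds = tf.tensor2d([encode(text)], undefined, 'int32');
  // executeAsync handles graphs with control-flow ops, common in seq2seq models;
  // a single output tensor of token ids is assumed here
  const output = await model.executeAsync({ input_ids: inputIds });
  const tokens = await output.array();
  inputIds.dispose();
  output.dispose();
  return decode(tokens[0]);
}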

Describe alternatives you've considered
Free hosting tiers from other services impose significant restrictions and require inflexible configuration.

Additional context
Hosting a purely frontend application lets us build and serve it on GitHub Pages. The GitHub Pages site can then act as a free trial for users who want to try the tool without contributing to the project.

A new scheme for granting users access to a dedicated bullet-generation API could come in a future release. Access might be conditioned on fulfilling criteria such as:

1. creating an account linked to your GitHub or Gmail,
2. contributing code to the open-source repo under that account,
3. contributing clean data to the open-source repo under that account,
4. buying us a coffee under that account, etc.

justinthelaw converted this from a draft issue Aug 10, 2023
justinthelaw self-assigned this Aug 10, 2023
justinthelaw added the feature, help wanted, and javascript labels Aug 10, 2023
ishaan-jaff commented
Hi @justinthelaw, I believe we can help with this issue. I'm the maintainer of LiteLLM: https://github.com/BerriAI/litellm

TL;DR:
We let you use any LLM as a drop-in replacement for gpt-3.5-turbo.
You can use our proxy server to make your LLM calls if you don't want to spin up additional resources.

Usage

This calls the provider API directly

from litellm import completion
import os

# Set the provider API key as an environment variable
os.environ["OPENAI_API_KEY"] = "your-key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# OpenAI call
response = completion(model="gpt-3.5-turbo", messages=messages)

# Falcon call
response = completion(model="falcon-40b", messages=messages)

justinthelaw (Owner, Author) commented

Hi @ishaan-jaff! Thanks for the suggestion; however, this particular issue is about experimenting with simple, lightweight model hosting via the frontend.

When it comes to hosted or cloud-based inference, we've already created a simple FastAPI server for serving the eventual set of fine-tuned Opera LLM models.

For more context, we use custom model checkpoints (like the ones in our HuggingFace repo) that aren't entirely standard, and then we fine-tune those further into models for specific tasks. We store the resulting weights and configs locally, since the file sizes and inference speed are acceptable for now.

Projects: 📋 Backlog
Development: No branches or pull requests
5 participants