
Check how the API response time varies now between business and personal accounts #23

Open
vrodriguezf opened this issue Nov 15, 2023 · 3 comments

Comments

@vrodriguezf
Contributor

Do it for different models

@OhhTuRnz
Collaborator

I'll add some context from our Slack conversation:

  • First of all, few-shot prompting takes far too long: response times ranged from 60 to 90 seconds. That happens because of the number of tokens we send every time we include the full history; it could be improved if we only sent the history once.
  • Fine-tuned models look quite good. Their completions were as good as few-shot prompting, but in a much more reasonable time: you can get a response in under 1 second. One of the biggest problems I can think of is that when the target is very close, the delay between prompts needs to be shortened, because more dynamic responses are required.
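A minimal sketch of how these latencies could be measured (the `fake_completion` stub below is hypothetical; swap in the real client call, e.g. a chat-completion request, to time actual few-shot vs. fine-tuned prompts):

```python
import time

def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Hypothetical stub standing in for an LLM request; replace with the
# real API call to compare few-shot vs. fine-tuned response times.
def fake_completion(prompt):
    time.sleep(0.05)  # simulate network + inference delay
    return f"echo: {prompt}"

result, latency = timed_call(fake_completion, "hello")
```

Wrapping the call this way keeps the timing logic separate from the request itself, so the same helper works for both prompting strategies.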

@vrodriguezf
Contributor Author

My thought right now is that, for the approach of replicating human behaviour, RAG (custom GPTs) or prompt engineering on their own, without fine-tuning, are not going to get us anywhere here because of the latency, especially RAG.

On paths other than human behaviour cloning, such as the one @DumplingLife is pursuing of forecasting the future state of each object and making a plan based on that, these approaches could be more useful, despite being slower.

Anyway, as they say in this [fantastic talk](https://www.youtube.com/watch?v=ahnGLM-RC1Y&t=1429s) by OpenAI on optimizing LLMs, the best way to explore optimal LLM usage is not to commit exclusively to one path (RAG, prompt engineering, fine-tuning); it can combine multiple approaches.

@vrodriguezf
Contributor Author

So the question related to this issue is: are fine-tuned models on the arclab business account slower than fine-tuned models on a personal account? This is extremely relevant.
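One way to sketch that comparison (the two stubs below are hypothetical; in a real benchmark each would be the same fine-tuned model queried with the respective account's API key):

```python
import statistics
import time

def benchmark(call, n=10):
    """Median latency (seconds) over n invocations of call()."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Hypothetical stubs: each would be the same fine-tuned model queried
# through the business or personal account in the real experiment.
business_call = lambda: time.sleep(0.02)
personal_call = lambda: time.sleep(0.01)

business_median = benchmark(business_call)
personal_median = benchmark(personal_call)
```

Comparing medians (or percentiles) over repeated calls is more robust than a single timing, since API latency is noisy and a one-off measurement could easily mislead the account comparison.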
