I'll be adding some context given our Slack convo:
First of all, few-shot prompting takes too much time; response times reached 60 to 90 seconds. That happened because of the number of tokens we send every time we include the full history; it could be improved if we only sent the history once.
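To make the token-cost point above concrete, here is a minimal back-of-the-envelope sketch (assuming the common ~4 characters per token heuristic, and made-up sizes for the history and messages) comparing resending the few-shot history every turn against sending it once:

```python
def estimate_tokens(n_chars: int) -> int:
    """Very rough token estimate: ~4 characters per token (heuristic)."""
    return max(1, n_chars // 4)

def tokens_resending_history(history_chars: int, message_chars: int, turns: int) -> int:
    """Total prompt tokens over `turns` turns when the full few-shot
    history is resent with every request (the slow case above)."""
    return (estimate_tokens(history_chars) + estimate_tokens(message_chars)) * turns

def tokens_history_once(history_chars: int, message_chars: int, turns: int) -> int:
    """Total prompt tokens if the history were sent only on the first turn."""
    return estimate_tokens(history_chars) + estimate_tokens(message_chars) * turns

if __name__ == "__main__":
    # Hypothetical sizes: 8000-char few-shot history, 200-char messages, 30 turns.
    print(tokens_resending_history(8000, 200, 30))  # 61500
    print(tokens_history_once(8000, 200, 30))       # 3500
```

The gap grows linearly with the number of turns, which matches the 60–90 s latencies we saw in long sessions.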
Fine-tuned models seem quite good. Responses were as good as with few-shot prompting but in a much more reasonable time: you can get a response in under 1 second. One of the biggest problems I could think of was that when the target was pretty close, the delay between prompts needed to be shortened, because more dynamic responses were required.
My thought right now is that, for the approach of replicating human behaviour, RAG (custom GPTs) or prompt engineering on their own, without fine-tuning, are not going to get us anywhere here because of the latency, especially RAG.
On paths other than human behaviour cloning, such as the one @DumplingLife is pursuing of forecasting the future state of each object and planning based on that, these approaches could be more useful, despite being slower.
Anyway, as they say in this [fantastic talk](https://www.youtube.com/watch?v=ahnGLM-RC1Y&t=1429s) by OpenAI on optimizing LLMs, the best approach to exploring optimal LLM usage is not exclusively one path (RAG, prompt engineering, fine-tuning); it can combine multiple aspects.
So the question related to this issue is: are fine-tuned models under the arclab business account slower than fine-tuned models under a personal account? This is extremely relevant. Please test this for different models.
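A minimal latency-benchmark sketch for that comparison. The `call_model` callable is a placeholder: bind one OpenAI client to the arclab account and one to a personal account, run `benchmark` with the same fine-tuned model on each, and compare the stats. The commented wiring uses the real `client.chat.completions.create` call from `openai>=1.0`, but the API key variable and the `ft:` model ID shown are hypothetical.

```python
import statistics
import time
from typing import Callable

def benchmark(call_model: Callable[[str], str], prompt: str, trials: int = 5) -> dict:
    """Time `trials` calls to `call_model` and return latency stats in seconds."""
    latencies = []
    for _ in range(trials):
        start = time.perf_counter()
        call_model(prompt)  # response content is ignored; we only time the call
        latencies.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(latencies),
        "median": statistics.median(latencies),
        "max": max(latencies),
    }

# Example wiring (hypothetical key variable and model ID):
# from openai import OpenAI
# client = OpenAI(api_key=ARCLAB_API_KEY)  # swap the key to switch accounts
# def call_model(prompt: str) -> str:
#     return client.chat.completions.create(
#         model="ft:gpt-3.5-turbo-0125:arclab::abc123",
#         messages=[{"role": "user", "content": prompt}],
#     ).choices[0].message.content
# print(benchmark(call_model, "test prompt", trials=10))
```

Running the same script, model, and prompt from both accounts should answer the question directly; repeating it per model covers the "different models" part.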