I was wondering what the best way would be to split FlexGen execution into two separate runs: one that does prefill only and one that does decode only.
How should I save the values produced during prefill, and how should I load them when I run FlexGen again for decode only?
Thanks