Can a quantized Diffusers model (e.g. the text encoder) be saved and loaded? #264
They can be. Follow this: https://github.com/huggingface/optimum-quanto?tab=readme-ov-file#llm-models For the unimplemented classes, you can just refer to the existing implementations and send us a PR.
How can I save and load a quantized text encoder for the PixArt-alpha/PixArt-Sigma-XL-2-1024-MS model?
This is shown in the guide.
Also, @sayakpaul, is there a way to load models without entirely requantizing them? I'm trying to load Flux from a quantized save (I made a class for it), but it takes a while, I think due to the requantization, to the point where there's no real reason not to just load normally and quantize on the fly.
Will let @dacorvo comment on that. But he is on vacation, so expect a delay.
@Ednaordinary some recent pull requests might have fixed the long requantization times (see #290).
@dacorvo Nice! Something along the way must have also changed how the first move between devices works, because I'm seeing much faster transfers between CPU and GPU on the first move. Thanks for your hard work!
This must be #291. Kudos to @latentCall145!