support 4K images (#135)

* update diffusers convert script for 2K, 4K models
* update README.md and model_zoo.md
* add 4096 config files
* 1. add 4096 aspect ratio; 2. pre-commit
* add 4K sana ComfyUI related files
* update comfyui.md
* pre-commit

Signed-off-by: lawrence-cj <[email protected]>
NVlabs · Jan 8, 2025 · 2e0f047
1 parent 1ca56ea

Showing 12 changed files with 711 additions and 10 deletions.
diff --git a/README.md b/README.md
@@ -36,6 +36,8 @@ As a result, Sana-0.6B is very competitive with modern giant diffusion model (e.
 
 ## 🔥🔥 News
 
+- (🔥 New) \[2025/1/8\] 4K resolution [Sana models](asset/docs/model_zoo.md) is supported in [Sana-ComfyUI](https://github.com/Efficient-Large-Model/ComfyUI_ExtraModels) and [work flow](asset/docs/ComfyUI/Sana_FlowEuler_4K.json) is also prepared. [\[4K guidance\]](asset/docs/ComfyUI/comfyui.md)
+- (🔥 New) \[2025/1/8\] 1.6B 4K resolution [Sana models](asset/docs/model_zoo.md) are released: [\[BF16 pth\]](https://huggingface.co/Efficient-Large-Model/Sana_1600M_4Kpx_BF16) or [\[BF16 diffusers\]](https://huggingface.co/Efficient-Large-Model/Sana_1600M_4Kpx_BF16_diffusers). 🚀 Get your 4096x4096 resolution images within 20 seconds! Find more samples in [Sana page](https://nvlabs.github.io/Sana/). Thanks [SUPIR](https://github.com/Fanghua-Yu/SUPIR) for their wonderful work and support.
 - (🔥 New) \[2025/1/2\] Bug in the `diffusers` pipeline is solved. [Solved PR](https://github.com/huggingface/diffusers/pull/10431)
 - (🔥 New) \[2025/1/2\] 2K resolution [Sana models](asset/docs/model_zoo.md) is supported in [Sana-ComfyUI](https://github.com/Efficient-Large-Model/ComfyUI_ExtraModels) and [work flow](asset/docs/ComfyUI/Sana_FlowEuler_2K.json) is also prepared.
 - (🔥 New) \[2024/12/20\] 1.6B 2K resolution [Sana models](asset/docs/model_zoo.md) are released: [\[BF16 pth\]](https://huggingface.co/Efficient-Large-Model/Sana_1600M_2Kpx_BF16) or [\[BF16 diffusers\]](https://huggingface.co/Efficient-Large-Model/Sana_1600M_2Kpx_BF16_diffusers). 🚀 Get your 2K resolution images within 4 seconds! Find more samples in [Sana page](https://nvlabs.github.io/Sana/). Thanks [SUPIR](https://github.com/Fanghua-Yu/SUPIR) for their wonderful work and support.
@@ -325,7 +327,7 @@ We will try our best to release
 - \[✅\] DC-AE Diffusers
 - \[✅\] Sana merged in Diffusers(https://github.com/huggingface/diffusers/pull/9982)
 - \[✅\] LoRA training by [@paul](https://github.com/sayakpaul)(`diffusers`: https://github.com/huggingface/diffusers/pull/10234)
-- \[✅\] 2K resolution models.(Thanks [@SUPIR](https://github.com/Fanghua-Yu/SUPIR) to provide a 4K super-resolution model)
+- \[✅\] 2K/4K resolution models.(Thanks [@SUPIR](https://github.com/Fanghua-Yu/SUPIR) to provide a 4K super-resolution model)
 - \[💻\] ControlNet (train & inference & models)
 - \[💻\] 8bit / 4bit Laptop development
 - \[💻\] Larger model size
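
Note on the "add 4096 aspect ratio" item in the commit message: the sketch below illustrates how an aspect-ratio table of this kind is typically used, namely snapping a requested size to the nearest supported bucket. It is a minimal sketch under stated assumptions, not the repository's implementation. The table `EXAMPLE_ASPECT_RATIO_4096`, its values, and the helper `closest_bucket` are hypothetical and are not copied from `diffusion/data/datasets/utils.py`.

```python
# Hypothetical table: ratio (height / width) as a string -> [height, width].
# The entries are illustrative only; they are NOT the values shipped in the
# repository's ASPECT_RATIO_4096_TEST.
EXAMPLE_ASPECT_RATIO_4096 = {
    "0.5": [2816.0, 5632.0],
    "1.0": [4096.0, 4096.0],
    "2.0": [5632.0, 2816.0],
}


def closest_bucket(height, width, ratios):
    """Return the (ratio_key, (height, width)) bucket nearest to height / width."""
    target = height / width
    # Pick the key whose numeric ratio is closest to the requested ratio.
    key = min(ratios, key=lambda r: abs(float(r) - target))
    bucket_h, bucket_w = ratios[key]
    return key, (int(bucket_h), int(bucket_w))


if __name__ == "__main__":
    # A 16:9 landscape request is mapped to the nearest landscape bucket.
    print(closest_bucket(1080, 1920, EXAMPLE_ASPECT_RATIO_4096))
```

Keeping the ratios as dictionary keys means supporting a new resolution tier only requires a new table; the lookup logic itself does not change.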

diff --git a/app/sana_pipeline.py b/app/sana_pipeline.py
@@ -26,7 +26,12 @@
 
 
 from diffusion import DPMS, FlowEuler
-from diffusion.data.datasets.utils import ASPECT_RATIO_512_TEST, ASPECT_RATIO_1024_TEST, ASPECT_RATIO_2048_TEST
+from diffusion.data.datasets.utils import (
+    ASPECT_RATIO_512_TEST,
+    ASPECT_RATIO_1024_TEST,
+    ASPECT_RATIO_2048_TEST,
+    ASPECT_RATIO_4096_TEST,
+)
 from diffusion.model.builder import build_model, get_tokenizer_and_text_encoder, get_vae, vae_decode
 from diffusion.model.utils import get_weight_dtype, prepare_prompt_ar, resize_and_crop_tensor
 from diffusion.utils.config import SanaConfig, model_init_config