Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis
Jiapeng Zhu*, Ceyuan Yang*, Kecheng Zheng, Yinghao Xu, Zifan Shi, Yujun Shen
arXiv preprint arXiv:2309.03904
[[Paper](https://arxiv.org/abs/2309.03904)]
TODO:
- Release inference code
- Release text-to-image generator at 64x64 resolution
- Release models at higher resolution
- Release training code
- Release plug-ins/efficient algorithms for more functionalities
This repository is built on Hammer, where you can find more detailed installation instructions. Below, we summarize the steps needed to reproduce our results.
- Environment: CUDA version == 11.3.
- Install package requirements with `conda`:

```shell
conda create -n aurora python=3.8  # create virtual environment with Python 3.8
conda activate aurora
pip install -r requirements/minimal.txt -f https://download.pytorch.org/whl/cu113/torch_stable.html
```
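Before running any of the commands below, it can help to confirm that the environment actually sees the GPU. This is a minimal sketch using standard PyTorch calls; it is not part of the original repository:

```python
import torch

# Confirm that PyTorch was installed with CUDA support and can see a GPU.
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available:  {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"Device:          {torch.cuda.get_device_name(0)}")
```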
First, please download the pre-trained model here.
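Once downloaded, you can sanity-check the checkpoint before running inference. The sketch below assumes `aurora_v1.pth` is a standard PyTorch checkpoint; the structure it prints is illustrative, not documented by the repository:

```python
import torch

# Load the checkpoint on CPU so no GPU is needed for inspection.
ckpt = torch.load('aurora_v1.pth', map_location='cpu')

# Print the top-level keys to confirm the file downloaded intact.
if isinstance(ckpt, dict):
    print(f"Top-level keys: {list(ckpt.keys())}")
else:
    print(f"Loaded object of type: {type(ckpt).__name__}")
```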
To synthesize an image from a given text prompt, use the following command:

```shell
python run_synthesize.py aurora_v1.pth 'A photo of a tree with autumn leaves'
```
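To generate images for several prompts in one go, the same script can be driven from Python. This is a small sketch assuming `run_synthesize.py` takes exactly the checkpoint path and prompt shown above; the prompt list is arbitrary:

```python
import subprocess

# Example prompts; replace with your own.
prompts = [
    'A photo of a tree with autumn leaves',
    'A photo of a victorian house',
]

for prompt in prompts:
    # Invoke the inference script once per prompt, mirroring the CLI above.
    subprocess.run(['python', 'run_synthesize.py', 'aurora_v1.pth', prompt], check=True)
```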
To interpolate between two text prompts, use the following command:

```shell
python run_interpolate.py aurora_v1.pth \
    --src_prompt 'A photo of a tree with autumn leaves' \
    --dst_prompt 'A photo of a victorian house'
```
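Conceptually, prompt interpolation blends the conditioning vectors of the two prompts. The sketch below shows plain linear interpolation between two embeddings; it assumes the generator is conditioned on text embeddings (e.g., from CLIP, which this project builds on) and uses random tensors as stand-ins for real embeddings:

```python
import torch

# Stand-ins for the source and destination text embeddings; in practice
# these would come from the model's text encoder (e.g., CLIP).
src_emb = torch.randn(512)
dst_emb = torch.randn(512)

# Walk from src to dst in evenly spaced steps and blend linearly.
for t in torch.linspace(0.0, 1.0, steps=8):
    blended = (1.0 - t.item()) * src_emb + t.item() * dst_emb
    # In the actual pipeline, `blended` would condition the generator
    # to produce one frame of the interpolation sequence.
    print(f"t={t.item():.2f}, norm={blended.norm().item():.3f}")
```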
Supported functionalities:
- Text-conditioned image generation
- Text prompt interpolation
This project is under the MIT License and is for research purposes ONLY.
We highly appreciate StyleGAN2, StyleGAN3, CLIP, and Hammer for their contributions to the community.
```bibtex
@article{zhu2023aurora,
  title   = {Exploring Sparse {MoE} in {GANs} for Text-conditioned Image Synthesis},
  author  = {Zhu, Jiapeng and Yang, Ceyuan and Zheng, Kecheng and Xu, Yinghao and Shi, Zifan and Shen, Yujun},
  journal = {arXiv preprint arXiv:2309.03904},
  year    = {2023}
}
```