CVPR 2024 | PIA：Personalized Image Animator

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Yiming Zhang*, Zhening Xing*, Yanhong Zeng†, Youqing Fang, Kai Chen†

(*equal contribution, †corresponding author)

PIA is a personalized image animation method that generates videos with high motion controllability and strong text and image alignment.

If you find our project helpful, please give it a star ⭐ or cite it; we would be very grateful 💖.

What's New

2024/01/03 Replicate Demo & API support!
2024/01/03 Colab support from camenduru!
2023/12/28 Support scaled_dot_product_attention for 1024x1024 images with just 16GB of GPU memory.
2023/12/25 HuggingFace demo is available now! 🤗 Hub
2023/12/22 Release the demo of PIA on OpenXLab and checkpoints on Google Drive or HuggingFace.

Setup

Prepare Environment

Use the following command to create a conda environment for PIA from scratch:

conda env create -f pia.yml
conda activate pia

If you prefer to install PIA on top of an existing environment, use environment-pt2.yaml for PyTorch==2.0.0. For a lower version of PyTorch (e.g. 1.13.1), use the following command:

conda env create -f environment.yaml
conda activate pia

We strongly recommend PyTorch==2.0.0, which supports scaled_dot_product_attention for memory-efficient image animation.
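A quick way to confirm that the active environment exposes this function is sketched below. The helper name `sdpa_available` is ours for illustration and is not part of PIA; it only checks whether `torch.nn.functional` has a `scaled_dot_product_attention` attribute, which PyTorch added in version 2.0.

```python
import importlib.util


def sdpa_available() -> bool:
    """Return True if PyTorch is installed and exposes scaled_dot_product_attention.

    Hypothetical helper for illustration; not part of the PIA code base.
    """
    # find_spec avoids an ImportError when torch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch.nn.functional as F

    return hasattr(F, "scaled_dot_product_attention")


if __name__ == "__main__":
    print("scaled_dot_product_attention available:", sdpa_available())
```

If this reports that the function is unavailable, the environment is probably running a PyTorch version older than 2.0.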

Download checkpoints

Download Stable Diffusion v1-5

conda install git-lfs
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/

Download PIA

git clone https://huggingface.co/Leoxing/PIA models/PIA/

Download Personalized Models

bash download_bashscripts/1-RealisticVision.sh
bash download_bashscripts/2-RcnzCartoon.sh
bash download_bashscripts/3-MajicMix.sh

You can also download pia.ckpt manually from Google Drive or HuggingFace.

Arrange the checkpoints as follows:

└── models
    ├── DreamBooth_LoRA
    │   ├── ...
    ├── PIA
    │   ├── pia.ckpt
    └── StableDiffusion
        ├── vae
        ├── unet
        └── ...
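Before running inference, it can help to check that the downloads landed where the layout above expects them. The following is a minimal sketch, assuming it is run from the repository root; the function name `missing_checkpoints` and the `EXPECTED` list are illustrative and only cover the paths named explicitly in the tree.

```python
from pathlib import Path
from typing import List

# Paths copied from the layout above, relative to the repository root.
# Entries shown as "..." in the tree are not checked.
EXPECTED = [
    "models/DreamBooth_LoRA",
    "models/PIA/pia.ckpt",
    "models/StableDiffusion/vae",
    "models/StableDiffusion/unet",
]


def missing_checkpoints(root: str = ".") -> List[str]:
    """Return the expected paths that do not exist under root."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected checkpoint paths are present.")
```

The check only tests for existence; it does not verify that the checkpoint files are complete or uncorrupted.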

Inference

Image Animation

Image-to-video results can be obtained by running:

python inference.py --config=example/config/lighthouse.yaml
python inference.py --config=example/config/harry.yaml
python inference.py --config=example/config/majic_girl.yaml

After running the commands above, you can find the results in example/result:

Input Image	lightning, lighthouse	sun rising, lighthouse	fireworks, lighthouse

Input Image	1boy smiling	1boy playing the magic fire	1boy is waving hands

Input Image	1girl is smiling	1girl is crying	1girl, snowing

Motion Magnitude

You can control the motion magnitude with the --magnitude parameter:

python inference.py --config=example/config/xxx.yaml --magnitude=0 # Small Motion
python inference.py --config=example/config/xxx.yaml --magnitude=1 # Moderate Motion
python inference.py --config=example/config/xxx.yaml --magnitude=2 # Large Motion
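To compare all three settings on one input, the commands above can be generated programmatically. This is a minimal sketch that uses only the `--config` and `--magnitude` flags shown above; the helper name `magnitude_commands` is hypothetical, and the script prints the commands rather than executing them.

```python
import shlex
import sys
from typing import Dict, List

# Magnitude values and their labels, as documented above.
MAGNITUDES: Dict[int, str] = {0: "small", 1: "moderate", 2: "large"}


def magnitude_commands(config: str) -> List[List[str]]:
    """Build one inference.py command per motion magnitude for a config file."""
    return [
        [sys.executable, "inference.py", f"--config={config}", f"--magnitude={m}"]
        for m in MAGNITUDES
    ]


if __name__ == "__main__":
    # Print the commands instead of running them. To execute, pass each
    # list to subprocess.run(cmd, check=True) from the repository root.
    for cmd in magnitude_commands("example/config/labrador.yaml"):
        print(shlex.join(cmd))
```

Building each command as an argument list rather than a single string avoids shell quoting problems if a config path contains spaces.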

Examples:

python inference.py --config=example/config/labrador.yaml
python inference.py --config=example/config/bear.yaml
python inference.py --config=example/config/genshin.yaml

Input Image & Prompt	Small Motion	Moderate Motion	Large Motion
a golden labrador is running
1bear is walking, ...
cherry blossom, ...

Style Transfer

To achieve style transfer, run inference with the --style_transfer flag (don't forget to set the base model in xxx.yaml):

Examples:

python inference.py --config example/config/concert.yaml --style_transfer
python inference.py --config example/config/anya.yaml --style_transfer

Input Image & Base Model	1man is smiling	1man is crying	1man is singing
Realistic Vision
RCNZ Cartoon 3d
	1girl smiling	1girl open mouth	1girl is crying, pout
RCNZ Cartoon 3d

Loop Video

You can generate a looping video with the --loop parameter:

python inference.py --config=example/config/xxx.yaml --loop

Examples:

python inference.py --config=example/config/lighthouse.yaml --loop
python inference.py --config=example/config/labrador.yaml --loop

Input Image	lightning, lighthouse	sun rising, lighthouse	fireworks, lighthouse

Input Image	labrador jumping	labrador walking	labrador running

Training

We provide a training script for PIA. It borrows heavily from AnimateDiff, so please prepare the dataset and configuration files according to its guideline.

After preparation, you can train the model by running the following command using torchrun:

torchrun --nnodes=1 --nproc_per_node=1 train.py --config example/config/train.yaml

or with Slurm:

srun --quotatype=reserved --job-name=pia --gres=gpu:8 --ntasks-per-node=8 --ntasks=8 --cpus-per-task=4 --kill-on-bad-exit=1 python train.py --config example/config/train.yaml

AnimateBench

We have open-sourced AnimateBench on HuggingFace. It includes images, prompts, and configs for evaluating PIA and other image animation methods.

BibTex

@inproceedings{zhang2024pia,
  title={Pia: Your personalized image animator via plug-and-play modules in text-to-image models},
  author={Zhang, Yiming and Xing, Zhening and Zeng, Yanhong and Fang, Youqing and Chen, Kai},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7747--7756},
  year={2024}
}

Contact Us

Yiming Zhang: zhangyiming@pjlab.org.cn

Zhening Xing: xingzhening@pjlab.org.cn

Yanhong Zeng: zengyh1900@gmail.com

Acknowledgements

The code is built upon AnimateDiff, Tune-a-Video, and PySceneDetect.

You may also want to try other projects from our team:
