ReAtCo

The official PyTorch implementation of "Re-Attentional Controllable Video Diffusion Editing" (AAAI 2025).

This work proposes a new text-guided video editing framework for controllable video generation and editing, with a particular emphasis on controlling the spatial locations of multiple foreground objects.

Video Demos

[Source Video]: "Two dolphins are swimming in the blue ocean." "A jellyfish and a goldfish are swimming in the blue ocean, with the jellyfish is to the left of the goldfish." "A turtle and a goldfish are swimming in the blue ocean, with the turtle is to the left of the goldfish." "A jellyfish and a octopus are swimming in the blue ocean, with the jellyfish is to the left of the octopus."
[Source Video]: "Two hares are grazing in the grass." "A swan and a hare are grazing in the grass, with the swan is to the left of the hare." "A cat and a swan are grazing in the grass, with the cat is to the left of the swan." "A cat and a swan are grazing in the yellow meadow, with the cat is to the left of the swan."

Overview Framework of ReAtCo

The main idea of ReAtCo is to refocus the cross-attention activation responses between the edited text prompt and the target video during the denoising stage, which yields an edited video that is aligned with the desired spatial locations and semantically faithful to the edited prompt. More details can be found in our paper.
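For a concrete picture, a highly simplified sketch of the re-attention idea is given below: cross-attention probabilities for the target object's tokens are amplified inside a user-specified region mask and suppressed outside it. The function name, arguments, scaling factors, and renormalization are illustrative assumptions, not the repo's actual implementation:

```python
# Conceptual sketch of refocusing cross-attention (illustrative assumptions; not the repo's implementation).
import torch

def refocus_cross_attention(attn_probs, region_mask, token_ids, boost=2.0, suppress=0.5):
    """
    attn_probs:  [batch, num_pixels, num_text_tokens] cross-attention probabilities.
    region_mask: [batch, num_pixels] binary mask of the target object's desired region (1 = inside).
    token_ids:   indices of the edited object's tokens in the text prompt.
    """
    refocused = attn_probs.clone()
    inside = region_mask.bool()                      # [batch, num_pixels]
    for t in token_ids:
        col = refocused[:, :, t]
        # Amplify the object token's responses inside its target region, damp them outside.
        refocused[:, :, t] = torch.where(inside, col * boost, col * suppress)
    # Renormalize so each pixel's attention over the text tokens still sums to one.
    return refocused / refocused.sum(dim=-1, keepdim=True).clamp_min(1e-8)
```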

Usage

Below we describe how to run our code and edit videos with the desired controllability.

1. Requirements

We use the classic Tune-A-Video as the pretrained base video editing model, so the requirements follow Tune-A-Video's publicly available code. Note: because the latest xformers requires PyTorch 2.5.1, we have tested our code on this latest version with a V100 GPU; the full environment is listed in environment.txt.
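As a quick sanity check (not part of the official scripts), the following snippet verifies that PyTorch, CUDA, and xformers are available in your environment:

```python
# Quick environment sanity check (illustrative; not part of the official scripts).
import torch

print("PyTorch version:", torch.__version__)          # expected to be around 2.5.1 per the note above
print("CUDA available:", torch.cuda.is_available())   # a V100 (or RTX 3090/4090) GPU is assumed

try:
    import xformers
    print("xformers version:", xformers.__version__)
except ImportError:
    print("xformers is not installed; memory-efficient attention will be unavailable.")
```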

2. Pretrained Video Editing Model

Before training the Tune-A-Video editing model, you need to download the pretrained Stable Diffusion v1-4 weights and place them in ./checkpoints.
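If you do not already have the weights locally, one option (not part of the official scripts) is to fetch them with huggingface_hub; the local directory name below is an assumption and should match the path expected by the training config:

```python
# Illustrative download of Stable Diffusion v1-4 into ./checkpoints (directory name is an assumption).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="CompVis/stable-diffusion-v1-4",
    local_dir="./checkpoints/stable-diffusion-v1-4",
)
```

With the checkpoint in place, fine-tune the base editing model with the following command: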

accelerate launch train_tuneavideo.py --config=configs/dolphins-swimming.yaml

The pretrained video editing model is saved in ./tune_a_video_model.

3. ReAtCo Video Editing

Generate the source video latents with the following command:

python generation_video_latents.py
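For intuition only: generating the video latents typically amounts to encoding each source frame into the Stable Diffusion VAE latent space (and then inverting those latents with DDIM); the script above handles this for you. A minimal sketch of the per-frame encoding step, with an assumed checkpoint path and frames given as a float tensor in [0, 1], is:

```python
# Illustrative only: encode source frames into SD VAE latents (the actual script may differ).
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "./checkpoints/stable-diffusion-v1-4", subfolder="vae"   # path is an assumption
).to("cuda", dtype=torch.float16)

@torch.no_grad()
def encode_frames(frames):
    """frames: [num_frames, 3, H, W] float tensor in [0, 1] -> latents: [num_frames, 4, H/8, W/8]."""
    x = frames.to("cuda", dtype=torch.float16) * 2.0 - 1.0   # scale to [-1, 1] as the VAE expects
    return vae.encode(x).latent_dist.sample() * 0.18215      # SD v1 latent scaling factor
```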

Edit the video with the following command:

python reatco_editing_dolphins-swimming.py

The edited videos are saved in ./edited_videos.

Note: In the script above, the default setting is the Resource-friendly ReAtCo Paradigm, which allows ReAtCo to edit videos on a consumer-grade GPU (e.g., an RTX 4090/3090). More details can be found in the Appendix of our paper. In particular, we set window_size=4 by default, which is compatible with RTX 4090/3090 GPUs; if you have sufficient GPU resources and do not want to use the resource-friendly paradigm, set window_size=video_length. A rough sketch of the windowing idea follows below.
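For intuition, the resource-friendly paradigm processes the frame sequence window by window instead of handling all frames at once; a rough sketch of how window_size partitions the temporal axis (not the repo's actual implementation) is:

```python
# Rough sketch of splitting the temporal axis into windows (not the repo's actual implementation).
def iter_temporal_windows(video_length, window_size):
    """Yield (start, end) frame-index pairs covering the whole video in chunks of window_size."""
    for start in range(0, video_length, window_size):
        yield start, min(start + window_size, video_length)

# Example: a 12-frame video with window_size=4 is processed in three 4-frame chunks;
# window_size=video_length recovers the full (non-windowed) paradigm.
for start, end in iter_temporal_windows(video_length=12, window_size=4):
    print(f"process frames [{start}:{end}]")
```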

Citation

If you find the code helpful in your research or work, please cite the following paper:

@article{ReAtCo,
  title={Re-Attentional Controllable Video Diffusion Editing},
  author={Wang, Yuanzhi and Li, Yong and Liu, Mengyi and Zhang, Xiaoya and Liu, Xin and Cui, Zhen and Chan, Antoni B.},
  journal={arXiv preprint arXiv:2412.11710},
  year={2024}
}
