"Motion to Dance Music Generation using Latent Diffusion Model" - Official PyTorch Implementation
This code was tested on Ubuntu 20.04.2 LTS and requires:
- Python 3.8
- CUDA capable GPU
- Pre-processed data and pretrained models (download separately)
pip install -r requirements.txt
The dataset used was the AIST++ dataset. The segmented music data is also provided here.
python audio_to_images.py
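This repository builds on Audio-Diffusion, which represents audio as mel-spectrogram images. Below is a minimal sketch of that conversion, assuming librosa and Pillow are installed; the sample rate, hop length, and spectrogram size are illustrative and not necessarily the settings used by `audio_to_images.py`.

```python
# Hypothetical sketch: convert an audio segment into a mel-spectrogram image.
# Parameters below are illustrative, not the repository's actual settings.
import numpy as np
import librosa
from PIL import Image

def audio_to_mel_image(wav_path, sr=22050, n_mels=256, n_fft=2048, hop_length=512):
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    mel_db = librosa.power_to_db(mel, ref=np.max)           # log scale, roughly [-80, 0] dB
    # Map [-80, 0] dB to [0, 255] so the spectrogram can be stored as an 8-bit image.
    img = (255 * (mel_db + 80.0) / 80.0).clip(0, 255).astype(np.uint8)
    return Image.fromarray(img)

# Example: audio_to_mel_image("segment_000.wav").save("segment_000.png")
```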
python norm_motion.py
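`norm_motion.py` normalizes the motion features before training. A minimal sketch of per-dimension standardization is shown below; the feature layout and the exact statistics used by the script are assumptions.

```python
# Hypothetical sketch: standardize motion features per dimension and keep the stats.
import numpy as np

def normalize_motion(motion):           # motion: (num_frames, num_features)
    mean = motion.mean(axis=0)
    std = motion.std(axis=0) + 1e-8     # avoid division by zero
    normed = (motion - mean) / std
    return normed, mean, std            # keep mean/std so features can be de-normalized later
```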
python train_unet_latent.py
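`train_unet_latent.py` trains the conditional U-Net in the latent space of a pretrained autoencoder. The sketch below illustrates one possible training step with Hugging Face `diffusers`, conditioning the noise prediction on motion features via cross-attention; the model classes, shapes, and hyperparameters (including the 219-dimensional motion feature size) are assumptions, not necessarily what the script uses.

```python
# Hypothetical sketch of one latent-diffusion training step (not the repository's exact code).
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler

vae = AutoencoderKL()                                   # assumed pretrained spectrogram autoencoder
unet = UNet2DConditionModel(cross_attention_dim=219)    # 219 = assumed motion feature size
scheduler = DDPMScheduler(num_train_timesteps=1000)
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-4)

def training_step(spec_images, motion_feats):
    # spec_images: (B, 3, H, W) mel-spectrogram images; motion_feats: (B, T, 219)
    with torch.no_grad():
        latents = vae.encode(spec_images).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    timesteps = torch.randint(0, scheduler.config.num_train_timesteps,
                              (latents.shape[0],), device=latents.device)
    noisy_latents = scheduler.add_noise(latents, noise, timesteps)
    # The U-Net predicts the added noise, conditioned on motion features via cross-attention.
    noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states=motion_feats).sample
    loss = F.mse_loss(noise_pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```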
python eval_cdcd.py --gen_audio=True
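With `--gen_audio=True`, the evaluation script first synthesizes music for the test motions. Below is a rough sketch of conditional latent-space sampling with `diffusers`; the model constructors, latent shape, and number of sampling steps are assumptions, and converting the decoded spectrogram back to a waveform is left to the repository's scripts.

```python
# Hypothetical sketch: sample a spectrogram image conditioned on motion features.
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler

vae = AutoencoderKL()                                   # assumed pretrained autoencoder
unet = UNet2DConditionModel(cross_attention_dim=219)    # assumed trained conditional U-Net
scheduler = DDPMScheduler(num_train_timesteps=1000)

@torch.no_grad()
def generate_spectrogram(motion_feats, latent_shape=(1, 4, 32, 32), num_steps=50):
    # motion_feats: (1, T, 219) conditioning features for one test sequence
    scheduler.set_timesteps(num_steps)
    latents = torch.randn(latent_shape)
    for t in scheduler.timesteps:
        noise_pred = unet(latents, t, encoder_hidden_states=motion_feats).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    image = vae.decode(latents / vae.config.scaling_factor).sample
    return image  # decoded mel-spectrogram; vocoding back to audio is a separate step
```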
python post_process.py
python eval_cdcd.py # beat coverage score, beat hit score, and FAD
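Beat coverage and beat hit scores compare the beats of the generated music with those of the ground-truth music. The sketch below uses librosa beat tracking; the matching tolerance and the exact score definitions follow common usage in prior motion-to-music work and are assumptions here, and the FAD computation is not shown.

```python
# Hypothetical sketch of beat coverage / beat hit scores (definitions assumed).
import numpy as np
import librosa

def beat_times(wav_path, sr=22050):
    y, _ = librosa.load(wav_path, sr=sr)
    _, beats = librosa.beat.beat_track(y=y, sr=sr)
    return librosa.frames_to_time(beats, sr=sr)

def beat_scores(gen_path, gt_path, tol=0.2):
    gen_beats, gt_beats = beat_times(gen_path), beat_times(gt_path)
    coverage = len(gen_beats) / max(len(gt_beats), 1)   # ratio of generated to ground-truth beats
    hits = sum(np.min(np.abs(gt_beats - b)) <= tol for b in gen_beats) if len(gt_beats) else 0
    hit_score = hits / max(len(gt_beats), 1)            # generated beats that land near a real beat
    return coverage, hit_score
```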
python bas_cdcd.py # beat align score
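The beat align score measures how well the kinematic beats of the input motion line up with the beats of the generated music, typically as the average of exp(-d²/2σ²) over nearest-beat distances d (as in Bailando and AIST++). The sketch below assumes beat times are already extracted and uses an illustrative σ.

```python
# Hypothetical sketch of the beat align score between motion beats and music beats.
import numpy as np

def beat_align_score(motion_beats, music_beats, sigma=3.0):
    # motion_beats, music_beats: 1-D arrays of beat times (or frame indices)
    motion_beats, music_beats = np.asarray(motion_beats), np.asarray(music_beats)
    if len(motion_beats) == 0 or len(music_beats) == 0:
        return 0.0
    dists = np.array([np.min(np.abs(music_beats - b)) for b in motion_beats])
    return float(np.mean(np.exp(-(dists ** 2) / (2 * sigma ** 2))))
```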
python genre.py # genre KLD (get pretrained model from https://github.com/PeiChunChang/MS-SincResNet)
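The genre KLD compares the genre distribution predicted for the generated music against that of the ground-truth music using a pretrained classifier (MS-SincResNet). The sketch below only shows the KL-divergence step between two averaged probability distributions; how the classifier logits are obtained is left to `genre.py`.

```python
# Hypothetical sketch: KL divergence between mean genre distributions (classifier not shown).
import torch
import torch.nn.functional as F

def genre_kld(gen_logits, gt_logits, eps=1e-8):
    # gen_logits, gt_logits: (num_clips, num_genres) raw classifier outputs
    p = F.softmax(gt_logits, dim=-1).mean(dim=0)    # reference (ground-truth) genre distribution
    q = F.softmax(gen_logits, dim=-1).mean(dim=0)   # generated genre distribution
    return float(torch.sum(p * torch.log((p + eps) / (q + eps))))
```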
Please include the following citation in any preprints and publications that use this repository.
@inproceedings{10.1145/3610543.3626164,
author = {Tan, Vanessa and Nam, Junghyun and Nam, Juhan and Noh, Junyong},
title = {Motion to Dance Music Generation Using Latent Diffusion Model},
year = {2023},
isbn = {9798400703140},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3610543.3626164},
doi = {10.1145/3610543.3626164},
booktitle = {SIGGRAPH Asia 2023 Technical Communications},
articleno = {5},
numpages = {4},
keywords = {latent diffusion model, 3D motion to music, music generation},
location = {Sydney, NSW, Australia},
series = {SA Technical Communications '23}
}
We would like to thank Joel Casimiro for helping create our preview image.
We would also like to thank the authors of the following works, on which our code is based: Audio-Diffusion, EDGE, Bailando, AIST++, and MS-SincResNet.