"Motion to Dance Music Generation using Latent Diffusion Model" - Official PyTorch Implementation
This code was tested on Ubuntu 20.04.2 LTS and requires:
- Python 3.8
- CUDA capable GPU
- Pre-processed data and pretrained models (download separately)
pip install -r requirements.txt
The dataset used was the AIST++ dataset. The segmented music data is also provided here.
python audio_to_images.py
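This repository builds on Audio-Diffusion, which represents audio as mel-spectrogram images. Below is a minimal sketch of that conversion, assuming librosa and Pillow are installed; the sample rate, hop length, and spectrogram size are illustrative and not necessarily the settings used by `audio_to_images.py`.

```python
# Hypothetical sketch: convert an audio segment into a mel-spectrogram image.
# Parameters below are illustrative, not the repository's actual settings.
import numpy as np
import librosa
from PIL import Image

def audio_to_mel_image(wav_path, sr=22050, n_mels=256, n_fft=2048, hop_length=512):
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    mel_db = librosa.power_to_db(mel, ref=np.max)           # log scale, roughly [-80, 0] dB
    # Map [-80, 0] dB to [0, 255] so the spectrogram can be stored as an 8-bit image.
    img = (255 * (mel_db + 80.0) / 80.0).clip(0, 255).astype(np.uint8)
    return Image.fromarray(img)

# Example: audio_to_mel_image("segment_000.wav").save("segment_000.png")
```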
python norm_motion.py
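`norm_motion.py` normalizes the motion features before training. A minimal sketch of per-dimension standardization is shown below; the feature layout and the exact statistics used by the script are assumptions.

```python
# Hypothetical sketch: standardize motion features per dimension and keep the stats.
import numpy as np

def normalize_motion(motion):           # motion: (num_frames, num_features)
    mean = motion.mean(axis=0)
    std = motion.std(axis=0) + 1e-8     # avoid division by zero
    normed = (motion - mean) / std
    return normed, mean, std            # keep mean/std so features can be de-normalized later
```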
python train_unet_latent.py
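`train_unet_latent.py` trains the conditional U-Net in the latent space of a pretrained autoencoder. The sketch below illustrates one possible training step with Hugging Face `diffusers`, conditioning the noise prediction on motion features via cross-attention; the model classes, shapes, and hyperparameters (including the 219-dimensional motion feature size) are assumptions, not necessarily what the script uses.

```python
# Hypothetical sketch of one latent-diffusion training step (not the repository's exact code).
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler

vae = AutoencoderKL()                                   # assumed pretrained spectrogram autoencoder
unet = UNet2DConditionModel(cross_attention_dim=219)    # 219 = assumed motion feature size
scheduler = DDPMScheduler(num_train_timesteps=1000)
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-4)

def training_step(spec_images, motion_feats):
    # spec_images: (B, 3, H, W) mel-spectrogram images; motion_feats: (B, T, 219)
    with torch.no_grad():
        latents = vae.encode(spec_images).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    timesteps = torch.randint(0, scheduler.config.num_train_timesteps,
                              (latents.shape[0],), device=latents.device)
    noisy_latents = scheduler.add_noise(latents, noise, timesteps)
    # The U-Net predicts the added noise, conditioned on motion features via cross-attention.
    noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states=motion_feats).sample
    loss = F.mse_loss(noise_pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```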
python eval_cdcd.py --gen_audio=True
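With `--gen_audio=True`, the evaluation script first synthesizes music for the test motions. Below is a rough sketch of conditional latent-space sampling with `diffusers`; the model constructors, latent shape, and number of sampling steps are assumptions, and converting the decoded spectrogram back to a waveform is left to the repository's scripts.

```python
# Hypothetical sketch: sample a spectrogram image conditioned on motion features.
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler

vae = AutoencoderKL()                                   # assumed pretrained autoencoder
unet = UNet2DConditionModel(cross_attention_dim=219)    # assumed trained conditional U-Net
scheduler = DDPMScheduler(num_train_timesteps=1000)

@torch.no_grad()
def generate_spectrogram(motion_feats, latent_shape=(1, 4, 32, 32), num_steps=50):
    # motion_feats: (1, T, 219) conditioning features for one test sequence
    scheduler.set_timesteps(num_steps)
    latents = torch.randn(latent_shape)
    for t in scheduler.timesteps:
        noise_pred = unet(latents, t, encoder_hidden_states=motion_feats).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    image = vae.decode(latents / vae.config.scaling_factor).sample
    return image  # decoded mel-spectrogram; vocoding back to audio is a separate step
```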
python post_process.py
python eval_cdcd.py # beat coverage score, beat hit score, and FAD
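Beat coverage and beat hit scores compare the beats of the generated music with those of the ground-truth music. The sketch below uses librosa beat tracking; the matching tolerance and the exact score definitions follow common usage in prior motion-to-music work and are assumptions here, and the FAD computation is not shown.

```python
# Hypothetical sketch of beat coverage / beat hit scores (definitions assumed).
import numpy as np
import librosa

def beat_times(wav_path, sr=22050):
    y, _ = librosa.load(wav_path, sr=sr)
    _, beats = librosa.beat.beat_track(y=y, sr=sr)
    return librosa.frames_to_time(beats, sr=sr)

def beat_scores(gen_path, gt_path, tol=0.2):
    gen_beats, gt_beats = beat_times(gen_path), beat_times(gt_path)
    coverage = len(gen_beats) / max(len(gt_beats), 1)   # ratio of generated to ground-truth beats
    hits = sum(np.min(np.abs(gt_beats - b)) <= tol for b in gen_beats) if len(gt_beats) else 0
    hit_score = hits / max(len(gt_beats), 1)            # generated beats that land near a real beat
    return coverage, hit_score
```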
python bas_cdcd.py # beat align score
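The beat align score measures how well the kinematic beats of the input motion line up with the beats of the generated music, typically as the average of exp(-d²/2σ²) over nearest-beat distances d (as in Bailando and AIST++). The sketch below assumes beat times are already extracted and uses an illustrative σ.

```python
# Hypothetical sketch of the beat align score between motion beats and music beats.
import numpy as np

def beat_align_score(motion_beats, music_beats, sigma=3.0):
    # motion_beats, music_beats: 1-D arrays of beat times (or frame indices)
    motion_beats, music_beats = np.asarray(motion_beats), np.asarray(music_beats)
    if len(motion_beats) == 0 or len(music_beats) == 0:
        return 0.0
    dists = np.array([np.min(np.abs(music_beats - b)) for b in motion_beats])
    return float(np.mean(np.exp(-(dists ** 2) / (2 * sigma ** 2))))
```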
python genre.py # genre KLD (get pretrained model from https://github.com/PeiChunChang/MS-SincResNet)
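The genre KLD compares the genre distribution predicted for the generated music against that of the ground-truth music using a pretrained classifier (MS-SincResNet). The sketch below only shows the KL-divergence step between two averaged probability distributions; how the classifier logits are obtained is left to `genre.py`.

```python
# Hypothetical sketch: KL divergence between mean genre distributions (classifier not shown).
import torch
import torch.nn.functional as F

def genre_kld(gen_logits, gt_logits, eps=1e-8):
    # gen_logits, gt_logits: (num_clips, num_genres) raw classifier outputs
    p = F.softmax(gt_logits, dim=-1).mean(dim=0)    # reference (ground-truth) genre distribution
    q = F.softmax(gen_logits, dim=-1).mean(dim=0)   # generated genre distribution
    return float(torch.sum(p * torch.log((p + eps) / (q + eps))))
```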
Please include the following citation in any preprints and publications that use this repository.
@inproceedings{10.1145/3610543.3626164,
author = {Tan, Vanessa and Nam, Junghyun and Nam, Juhan and Noh, Junyong},
title = {Motion to Dance Music Generation Using Latent Diffusion Model},
year = {2023},
isbn = {9798400703140},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3610543.3626164},
doi = {10.1145/3610543.3626164},
booktitle = {SIGGRAPH Asia 2023 Technical Communications},
articleno = {5},
numpages = {4},
keywords = {latent diffusion model, 3D motion to music, music generation},
location = {Sydney, NSW, Australia},
series = {SA Technical Communications '23}
}
We would like to thank Joel Casimiro for helping create our preview image.
We would also like to thank the authors of the following works, on which our code is based: Audio-Diffusion, EDGE, Bailando, AIST++, and MS-SincResNet.