
Compose & Embellish: A Transformer-based Piano Generation System

Official PyTorch implementation of the paper:

  • Shih-Lun Wu and Yi-Hsuan Yang
    Compose & Embellish: Well-Structured Piano Performance Generation via A Two-Stage Approach
    Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2023
    Paper | Audio demo (Google Drive) | Model weights

Changelog

  • [24-07-29] Added a transformers-based GPT-2 implementation of the Embellish model, which doesn't depend on the fast-transformers package.

Prerequisites

  • Python 3.8 and CUDA 10.2 recommended
  • Install dependencies
    pip install -r requirements.txt
    
    # Optional, only required if you are using the config `stage02_embellish/config/pop1k7_default.yaml`
    pip install git+https://github.com/cifkao/fast-transformers.git@39e726864d1a279c9719d33a95868a4ea2fb5ac5
    
  • Download trained models from the HuggingFace Hub (make sure you're in the repository root directory)
    git clone https://huggingface.co/slseanwu/compose-and-embellish-pop1k7
    

Generate piano performances (with our trained models)

  • Stage 1: generate lead sheets (i.e., melody + chord progression)

    python3 stage01_compose/inference.py \
      stage01_compose/config/pop1k7_finetune.yaml \
      generation/stage01 \
      20
    

    You'll have 20 lead sheets under generation/stage01 after this step.
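    As a quick sanity check (not part of the official scripts), you can count the generated files; a minimal sketch, assuming the lead sheets are written with a `.mid` extension:

    ```python
    from pathlib import Path

    def count_midi(out_dir):
        """Count MIDI files in an output directory (non-recursive)."""
        return len(list(Path(out_dir).glob("*.mid")))

    # e.g. count_midi("generation/stage01") should report 20 after the command above
    ```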

  • Stage 2: generate full performances conditioned on Stage 1 lead sheets, using GPT-2 backbone

    python3 stage02_embellish/inference_gpt2.py \
      stage02_embellish/config/pop1k7_gpt2.yaml \
      generation/stage01 \
      generation/stage02_gpt2
    

    The samp_**_2stage_samp**.mid files under generation/stage02_gpt2 are the final results.

    • (Optional) Use Performer (from fast-transformers) backbone for stage 2 instead
      python3 stage02_embellish/inference.py \
        stage02_embellish/config/pop1k7_default.yaml \
        generation/stage01 \
        generation/stage02_performer
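    If you want to pick out just the final results programmatically, a small helper (hypothetical, not part of the repo) can filter by the naming scheme above:

    ```python
    import fnmatch

    def final_results(filenames):
        """Keep only the final two-stage outputs, per the samp_*_2stage_samp*.mid scheme."""
        return sorted(f for f in filenames if fnmatch.fnmatch(f, "samp_*_2stage_samp*.mid"))

    files = [
        "samp_01_2stage_samp01.mid",  # final result -> kept
        "samp_01_1stage_samp01.mid",  # intermediate -> skipped
        "notes.txt",                  # unrelated -> skipped
    ]
    print(final_results(files))  # ['samp_01_2stage_samp01.mid']
    ```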
      

Train (finetune) models on AILabs.tw Pop1K7 dataset

  • Stage 1: lead sheet (i.e., "Compose") model

    python3 stage01_compose/train.py stage01_compose/config/pop1k7_finetune.yaml
    
  • Stage 2: performance (i.e., "Embellish") model w/ GPT-2 backbone

    python3 stage02_embellish/train.py stage02_embellish/config/pop1k7_gpt2.yaml
    
    • (Optional) use Performer backbone instead, which allows a longer context window
      python3 stage02_embellish/train.py stage02_embellish/config/pop1k7_default.yaml
      

Note that these two training stages may be run in parallel.
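For example, with two GPUs available, both stages could be launched side by side (the `CUDA_VISIBLE_DEVICES` assignment is an assumption about your setup; adjust to your hardware):

```shell
# Sketch: run both training stages concurrently, one GPU each
CUDA_VISIBLE_DEVICES=0 python3 stage01_compose/train.py stage01_compose/config/pop1k7_finetune.yaml &
CUDA_VISIBLE_DEVICES=1 python3 stage02_embellish/train.py stage02_embellish/config/pop1k7_gpt2.yaml &
wait  # block until both stages finish
```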

Train on custom datasets

If you'd like to experiment with your own datasets, we suggest that you

Acknowledgements

We would like to thank the following people for their open-source implementations that paved the way for our work:

BibTeX

If this repo helps with your research, please consider citing:

@inproceedings{wu2023compembellish,
  title={{Compose \& Embellish}: Well-Structured Piano Performance Generation via A Two-Stage Approach},
  author={Wu, Shih-Lun and Yang, Yi-Hsuan},
  booktitle={Proc. Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2023},
  url={https://arxiv.org/pdf/2209.08212.pdf}
}