
Compose & Embellish: A Transformer-based Piano Generation System

Official PyTorch implementation of the paper:

  • Shih-Lun Wu and Yi-Hsuan Yang
    Compose & Embellish: Well-Structured Piano Performance Generation via A Two-Stage Approach
    Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2023
    Paper | Audio demo (Google Drive) | Model weights

Changelog

  • [24-07-29] Added a transformers-based GPT-2 implementation of the Embellish model, which doesn't depend on the fast-transformers package.

Prerequisites

  • Python 3.8 and CUDA 10.2 recommended
  • Install dependencies
    pip install -r requirements.txt
    
    # Optional, only required if you are using the config `stage02_embellish/config/pop1k7_default.yaml`
    pip install git+https://github.com/cifkao/fast-transformers.git@39e726864d1a279c9719d33a95868a4ea2fb5ac5
    
  • Download trained models from the HuggingFace Hub (make sure you're in the repository root directory)
    git clone https://huggingface.co/slseanwu/compose-and-embellish-pop1k7
    

Generate piano performances (with our trained models)

  • Stage 1: generate lead sheets (i.e., melody + chord progression)

    python3 stage01_compose/inference.py \
      stage01_compose/config/pop1k7_finetune.yaml \
      generation/stage01 \
      20
    

    You'll have 20 lead sheets under generation/stage01 after this step.
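    As a quick sanity check (not part of the official scripts), you can count the generated files; a minimal sketch, assuming the lead sheets are written with a `.mid` extension:

    ```python
    from pathlib import Path

    def count_midi(out_dir):
        """Count MIDI files in an output directory (non-recursive)."""
        return len(list(Path(out_dir).glob("*.mid")))

    # e.g. count_midi("generation/stage01") should report 20 after the command above
    ```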

  • Stage 2: generate full performances conditioned on Stage 1 lead sheets, using GPT-2 backbone

    python3 stage02_embellish/inference_gpt2.py \
      stage02_embellish/config/pop1k7_gpt2.yaml \
      generation/stage01 \
      generation/stage02_gpt2
    

    The samp_**_2stage_samp**.mid files under generation/stage02_gpt2 are the final results.

    • (Optional) Use Performer (from fast-transformers) backbone for stage 2 instead
      python3 stage02_embellish/inference.py \
        stage02_embellish/config/pop1k7_default.yaml \
        generation/stage01 \
        generation/stage02_performer
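    If you want to pick out just the final results programmatically, a small helper (hypothetical, not part of the repo) can filter by the naming scheme above:

    ```python
    import fnmatch

    def final_results(filenames):
        """Keep only the final two-stage outputs, per the samp_*_2stage_samp*.mid scheme."""
        return sorted(f for f in filenames if fnmatch.fnmatch(f, "samp_*_2stage_samp*.mid"))

    files = [
        "samp_01_2stage_samp01.mid",  # final result -> kept
        "samp_01_1stage_samp01.mid",  # intermediate -> skipped
        "notes.txt",                  # unrelated -> skipped
    ]
    print(final_results(files))  # ['samp_01_2stage_samp01.mid']
    ```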
      

Train (finetune) models on AILabs.tw Pop1K7 dataset

  • Stage 1: lead sheet (i.e., "Compose") model

    python3 stage01_compose/train.py stage01_compose/config/pop1k7_finetune.yaml
    
  • Stage 2: performance (i.e., "Embellish") model w/ GPT-2 backbone

    python3 stage02_embellish/train.py stage02_embellish/config/pop1k7_gpt2.yaml
    
    • (Optional) use Performer backbone instead, which allows a longer context window
      python3 stage02_embellish/train.py stage02_embellish/config/pop1k7_default.yaml
      

Note that these two training stages may be run in parallel.
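For example, with two GPUs available, both stages could be launched side by side (the `CUDA_VISIBLE_DEVICES` assignment is an assumption about your setup; adjust to your hardware):

```shell
# Sketch: run both training stages concurrently, one GPU each
CUDA_VISIBLE_DEVICES=0 python3 stage01_compose/train.py stage01_compose/config/pop1k7_finetune.yaml &
CUDA_VISIBLE_DEVICES=1 python3 stage02_embellish/train.py stage02_embellish/config/pop1k7_gpt2.yaml &
wait  # block until both stages finish
```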

Train on custom datasets

If you'd like to experiment with your own datasets, we suggest that you

Acknowledgements

We would like to thank the following people for their open-source implementations that paved the way for our work:

BibTeX

If this repo helps with your research, please consider citing:

@inproceedings{wu2023compembellish,
  title={{Compose \& Embellish}: Well-Structured Piano Performance Generation via A Two-Stage Approach},
  author={Wu, Shih-Lun and Yang, Yi-Hsuan},
  booktitle={Proc. Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2023},
  url={https://arxiv.org/pdf/2209.08212.pdf}
}