a) Download and extract the LJ Speech dataset, then create a link to the dataset folder: ln -s /xxx/LJSpeech-1.1/ data/raw/
b) Download and Unzip the ground-truth duration extracted by MFA: tar -xvf mfa_outputs.tar; mv mfa_outputs data/processed/ljspeech/
c) Run the following scripts to pack the dataset for training/inference.
export PYTHONPATH=.
CUDA_VISIBLE_DEVICES=0 python data_gen/tts/bin/binarize.py --config configs/tts/lj/fs2.yaml
# `data/binary/ljspeech` will be generated.
We provide the pre-trained model of HifiGAN vocoder.
Please unzip this file into checkpoints
before training your acoustic model.
CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/lj_ds_pndm.yaml --exp_name ds_pndm_lj_1 --reset
CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/lj_ds_pndm.yaml --exp_name ds_pndm_lj_1 --reset --infer