This is the FastSpeech 2 model used for a thesis. It is based on ming024's implementation.
In line with the forked repository, this FastSpeech 2 model follows the LibriTTS workflow; instead of multiple speakers, multiple emotions are used to train the model.
You can install the Python dependencies with
pip3 install -r requirements.txt
After training, run
python3 synthesize.py --text "YOUR_DESIRED_TEXT" --speaker_id SPEAKER_ID --restore_step 100000 --mode single -p config/EmoFS2/preprocess.yaml -m config/EmoFS2/model.yaml -t config/EmoFS2/train.yaml
The available speaker_id values are:
0 -> neutral
1 -> happy
2 -> angry
3 -> sad
4 -> surprise
The generated utterances will be put in output/result/.
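
For batch experiments, a small wrapper can map emotion names to the speaker IDs above and call synthesize.py. This is a minimal sketch built around the command line shown above; the function and mapping names are illustrative.

```python
# Minimal sketch of a convenience wrapper around synthesize.py.
# The command-line arguments mirror the synthesis command above; the
# emotion-to-ID mapping follows the speaker_id table in this README.
import subprocess

EMOTION_IDS = {"neutral": 0, "happy": 1, "angry": 2, "sad": 3, "surprise": 4}

def synthesize(text, emotion="neutral", restore_step=100000):
    """Synthesize a single utterance with the given emotion."""
    subprocess.run([
        "python3", "synthesize.py",
        "--text", text,
        "--speaker_id", str(EMOTION_IDS[emotion]),
        "--restore_step", str(restore_step),
        "--mode", "single",
        "-p", "config/EmoFS2/preprocess.yaml",
        "-m", "config/EmoFS2/model.yaml",
        "-t", "config/EmoFS2/train.yaml",
    ], check=True)

synthesize("This is a happy sentence.", emotion="happy")
```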
The dataset used is
- ESD: multiple speakers, two languages, five emotional states; each speaker's subset consists of the same 350 short audio clips per emotion.
Only speaker 14 is used.
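
Before preprocessing, it can help to verify the local copy of the corpus. The snippet below is a sketch that assumes the common ESD layout of one folder per speaker (speaker 14 assumed to be folder "0014") with one subfolder per emotion; the corpus path is a placeholder.

```python
# Sanity check of the ESD speaker used for training.
# Assumes one folder per speaker containing one subfolder per emotion;
# adjust esd_root to wherever the corpus is stored locally.
from pathlib import Path

esd_root = Path("/path/to/ESD")        # placeholder path
speaker_dir = esd_root / "0014"        # speaker 14

for emotion_dir in sorted(speaker_dir.iterdir()):
    if emotion_dir.is_dir():
        n_clips = len(list(emotion_dir.rglob("*.wav")))
        print(f"{emotion_dir.name}: {n_clips} clips")  # expect 350 per emotion
```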
First, run
python3 prepare_align.py config/EmoFS2/preprocess.yaml
After that, align the corpus and then run the preprocessing script:
python3 preprocess.py config/EmoFS2/preprocess.yaml
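
If preprocessing succeeds, the extracted features can be inspected. The check below is a sketch that assumes the fork keeps ming024's output layout (a preprocessed_data/EmoFS2 directory with mel, pitch, energy, and duration features plus train.txt and val.txt); adjust the path to whatever is set in preprocess.yaml.

```python
# Quick check of the preprocessing output.
# The output directory below is an assumption taken from ming024's layout;
# use the path configured in config/EmoFS2/preprocess.yaml instead if it differs.
from pathlib import Path

out_dir = Path("preprocessed_data/EmoFS2")   # assumed output path
for name in ["mel", "pitch", "energy", "duration"]:
    sub = out_dir / name
    count = len(list(sub.glob("*.npy"))) if sub.exists() else 0
    print(f"{name}: {count} files")
for meta in ["train.txt", "val.txt"]:
    print(f"{meta}: {'found' if (out_dir / meta).exists() else 'missing'}")
```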
Train your model with
python3 train.py -p config/EmoFS2/preprocess.yaml -m config/EmoFS2/model.yaml -t config/EmoFS2/train.yaml
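If the fork keeps ming024's logging, training and validation curves can typically be monitored with TensorBoard; the exact log directory depends on train.yaml, but it would look something like
tensorboard --logdir output/log/EmoFS2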