Release v2.4.0: Rectified Flow algorithm and new feature extractor based on harmonic-noise separation model · openvpi/DiffSinger

New generative model algorithm: Rectified Flow (#184)

Rectified Flow is a new ODE-based generative model algorithm, introduced in this paper and used in Stable Diffusion 3. Experimental results have shown that Rectified Flow outperforms the previous DDPM in all modules of DiffSinger. To our knowledge, this is the first publicly known use of Rectified Flow in SVS systems.

Rectified Flow is now the default algorithm for training new DiffSinger models. No action is required if you are using the template configuration file. Though not recommended, you can switch back to DDPM with the following line in your configuration:

diffusion_type: 'ddpm'  # default value is 'reflow'
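
To illustrate what "ODE-based" means here, the following is a minimal sketch of Euler sampling along a rectified flow, not DiffSinger's actual implementation. The names `euler_sample` and `toy_velocity` are hypothetical, the time convention (noise at t = 0, data at t = 1) is an assumption, and the velocity field is a closed-form toy standing in for a trained network. Because the toy path is a straight line, the sampler reaches the target with any number of steps, which is the property that makes the step count freely adjustable.

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(
    velocity: Callable[[Vector, float], Vector],
    noise: Vector,
    steps: int,
) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps."""
    if steps < 1:
        raise ValueError("steps must be a positive integer")
    x = list(noise)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # time at the start of this step, always < 1
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def toy_velocity(target: Vector) -> Callable[[Vector, float], Vector]:
    """Hypothetical stand-in for a trained model: the exact velocity field
    of a straight path from any starting point to a single target."""
    def velocity(x: Vector, t: float) -> Vector:
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return velocity


if __name__ == "__main__":
    field = toy_velocity([1.0, -2.0])
    for n in (1, 4, 20):
        print(n, euler_sample(field, [0.3, 0.7], steps=n))
```

With a real trained model the learned paths are only approximately straight, so fewer steps trade some quality for speed.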

Feature extractor based on harmonic-noise separation model (#196)

Harmonic-noise separation is a fundamental step in extracting breathiness, voicing and tension from a singing voice. The old WORLD-based method cannot cleanly separate the harmonic and noise components, which makes the extracted features less accurate than expected. We have introduced a new NN-based algorithm (Vocal Remover) for this separation process. With the new method, the performance of most variance parameters (especially tension) should improve.

The new harmonic-noise separator is now the default choice for preprocessing new datasets. Please read the guidance in GettingStarted.md and download the model file. Though not recommended, you can still use WORLD with the following line in your configuration:

hnsep: world  # default value is 'vr'

Other improvements, changes and bug fixes

The --speedup option in infer.py has been replaced by --steps to support continuous acceleration with Rectified Flow
All exported models have been adapted to the new continuous acceleration API
Mel log base migration: the log10 setting is no longer allowed in preprocessing
Mel log base migration: all exported models are converted to accept natural-log (base e) mel spectrograms
The trainer now shows an error message when the user sets every predict_* option to false in variance model training
The binarizer now shows an error message when negative values are found in ph_dur or note_dur
Package versions in requirements.txt have been updated; ONNX export requirements are now listed in requirements-onnx.txt
Bugfix: the extracted tension could be incorrect when the recording and label were not aligned
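
For readers who still hold log10 mel spectrograms, the mel log base migration above comes down to a change of logarithm base: ln(x) = log10(x) · ln(10). The sketch below is illustrative only. The helper name `log10_to_ln` and the nested-list layout (frames × mel bins) are assumptions, not part of DiffSinger.

```python
import math
from typing import List

LN10 = math.log(10.0)  # ln(10), the base-conversion factor


def log10_to_ln(mel_log10: List[List[float]]) -> List[List[float]]:
    """Convert a log10 mel spectrogram to natural-log scale.

    Each value v = log10(x) becomes v * ln(10) = ln(x).
    """
    return [[v * LN10 for v in frame] for frame in mel_log10]


if __name__ == "__main__":
    # Two frames with three mel bins each (made-up values).
    mel = [[-2.0, -1.0, 0.0], [0.5, 1.0, 2.0]]
    print(log10_to_ln(mel))
```

The conversion is a uniform scaling, so it needs no access to the original linear-magnitude spectrogram.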

Some changes may not be listed above. See full change log: v2.3.0...v2.4.0
