I appreciate your efforts, nice work.
However, your audio_toolkit is implemented with librosa and NumPy, which are not differentiable.
This may limit its applications. For example, if I have a TTS model that generates mel spectrograms, and your d-vector were fully differentiable, we could use it like a discriminator to force the TTS model's output to sound exactly like the target speaker.
From waveform to mel spectrogram, you can make the preprocessing fully differentiable with torchaudio, and it seems to stay consistent with librosa.
Hi, thanks for your suggestion. I'm actually considering ditching librosa for torchaudio, especially after I chose to do silence trimming with sox instead of webrtcvad.
Since I'd like to keep the preprocessing modules as simple as possible (importing as few packages as possible), I probably need some time to study the usage of sox effects in the most recent version of torchaudio.
I've developed completely new preprocessing toolkits which use torchaudio, can be compiled with TorchScript, and can be used anywhere without extra dependencies.