Audio synthesized from a feature file (.f32) is bad? #6

lmingde · 2019-08-20T09:39:13Z

I extract the feature file (.f32) by modifying /Tacotron-2/preprocessor.py:

# I changed preprocessor.py at line 133:
    # Output names: raw float32 features for LPCNet, plus the usual .npy files
    feature_name = 'feature-{}.f32'.format(wavfile)
    mel_filename = 'mel-{}.npy'.format(wavfile)
    linear_filename = 'linear-{}.npy'.format(wavfile)

    # Transpose so that each row is one frame
    mel_spectrogram = mel_spectrogram.T
    np.save(os.path.join(mel_dir, mel_filename), mel_spectrogram, allow_pickle=False)
    # Flatten frame by frame and dump the raw values (no header) to the .f32 file
    mel_spectrogram = mel_spectrogram.reshape((-1,))
    mel_spectrogram.tofile(os.path.join(feature_dir,feature_name))
    np.save(os.path.join(linear_dir, linear_filename), linear_spectrogram.T, allow_pickle=False)
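Because `tofile` writes raw values in the array's own dtype and without any header, a file written this way is only readable as float32 if the array really is float32, and the reader has to know the frame size in advance. The following is a minimal sketch for checking a written file, not part of either repository: `write_features`, `load_features`, and `frame_dim` are hypothetical names, and the frame size that `test_lpcnet` reads in taco mode is not stated in this thread, so it is left as a parameter.

```python
import numpy as np


def write_features(path, features):
    """Write a (frames, frame_dim) array as headerless float32, frame after frame.

    Hypothetical helper: the explicit float32 cast guards against silently
    writing 8-byte float64 values into a file that is later read as float32.
    """
    np.ascontiguousarray(features, dtype=np.float32).tofile(path)


def load_features(path, frame_dim):
    """Read a headerless float32 file back as a (frames, frame_dim) array.

    frame_dim is an assumption supplied by the caller; the file itself
    carries no shape information.
    """
    data = np.fromfile(path, dtype=np.float32)
    if data.size % frame_dim != 0:
        raise ValueError(
            "{} holds {} floats, which is not a multiple of frame_dim={}".format(
                path, data.size, frame_dim))
    return data.reshape(-1, frame_dim)
```

Loading a generated file with the frame size the vocoder expects, then comparing its shape and value range against a file produced by `dump_data` for the same audio, may help tell a layout problem apart from a model-quality problem.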

Then I run these commands:

make test_lpcnet taco=1 # Define TACOTRON2 macro
./test_lpcnet test_features.f32 test.s16
ffmpeg -f s16le -ar 16k -ac 1 -i test.s16 test-out.wav
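For reference, the last step only wraps the raw samples in a WAV header. A rough Python equivalent using the standard `wave` module is sketched below; `raw_s16_to_wav` is a hypothetical helper name, and it assumes the `.s16` file holds little-endian 16-bit mono samples at 16 kHz, matching the `ffmpeg` arguments above.

```python
import wave


def raw_s16_to_wav(raw_path, wav_path, sample_rate=16000, channels=1):
    """Wrap headerless signed 16-bit little-endian PCM in a WAV container."""
    with open(raw_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(pcm)
```

Calling `raw_s16_to_wav("test.s16", "test-out.wav")` should give the same audio as the `ffmpeg` command. If the result plays at the wrong speed or pitch, the sample rate assumption is the first thing to check.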

But the audio is bad:

superhg2012 · 2019-08-21T11:26:48Z

I ran into the same issue. Have you solved it? The features Tacotron 2 predicts for LPCNet are not accurate.

lmingde · 2019-08-21T13:05:57Z

> I ran into the same issue. Have you solved it? The features Tacotron 2 predicts for LPCNet are not accurate.

No. So far I have only used features extracted from audio with dump_data. I will try training T2 (Tacotron 2) to predict the features (.f32). Are you using the .f32 features for synthesis?

lmingde · 2019-08-21T13:07:39Z

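Also, I think this LPCNet needs to be updated. When I use this LPCNet to synthesize from features extracted by dump_data (in taco mode), the result is very poor, but with the latest version of LPCNet the result is very good.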

ysujiang · 2020-05-22T06:48:14Z

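@lmingde Hello, may I ask a few questions? I trained LPCNet and Tacotron 2 on my own dataset, and the synthesized speech quality is very poor: the speech is unclear and partly distorted. When training LPCNet, the loss at epoch=1 and at epoch=120 differs very little. Did you see anything similar when training LPCNet, and roughly what was the loss of your final model? Also, which version is the latest LPCNet you mentioned? Could you share a link?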

lmingde · 2020-05-25T01:01:49Z

@ysujiang The latest version is at the link given in the LPCNet paper.

ysujiang · 2020-06-02T11:35:04Z

@lmingde Could you send me the link, or the title of the paper? Thanks.

ysujiang · 2020-07-06T06:26:39Z

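@lmingde Hello, does the model you trained produce a tremolo?
The speech samples generated by my model currently have a tremolo. Have you run into this problem? Did you make any changes to taco or LPCNet?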

JunenuJ · 2020-09-06T15:58:05Z

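> @lmingde Hello, may I ask a few questions? I trained LPCNet and Tacotron 2 on my own dataset, and the synthesized speech quality is very poor: the speech is unclear and partly distorted. When training LPCNet, the loss at epoch=1 and at epoch=120 differs very little. Did you see anything similar when training LPCNet, and roughly what was the loss of your final model? Also, which version is the latest LPCNet you mentioned? Could you share a link?

Hello, I would like to ask: did you train LPCNet and Tacotron 2 separately? I am still not clear about this and would appreciate some clear guidance. Thanks.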

JunenuJ · 2020-09-08T07:16:06Z

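> @lmingde Hello, does the model you trained produce a tremolo?
> The speech samples generated by my model currently have a tremolo. Have you run into this problem? Did you make any changes to taco or LPCNet?

Hello, I see that the code synthesizes speech with GL. Did you synthesize with GL? My results are very poor. I would also like to ask how you synthesized with LPCNet: with the original features, or with the features predicted by Tacotron 2?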
