The audio synthesized from a feature file (.f32) is bad? #6

Open
lmingde opened this issue Aug 20, 2019 · 9 comments


lmingde commented Aug 20, 2019

I extracted the feature files (.f32) by modifying /Tacotron-2/preprocessor.py:

# Changes in preprocessor.py, around line 133:
    feature_name = 'feature-{}.f32'.format(wavfile)
    mel_filename = 'mel-{}.npy'.format(wavfile)
    linear_filename = 'linear-{}.npy'.format(wavfile)

    # Transpose before saving, then store the mel spectrogram as .npy.
    mel_spectrogram = mel_spectrogram.T
    np.save(os.path.join(mel_dir, mel_filename), mel_spectrogram, allow_pickle=False)
    # Flatten and dump the raw values as the .f32 feature file.
    # tofile() writes in the array's own dtype, so this assumes float32 data.
    mel_spectrogram = mel_spectrogram.reshape((-1,))
    mel_spectrogram.tofile(os.path.join(feature_dir, feature_name))
    np.save(os.path.join(linear_dir, linear_filename), linear_spectrogram.T, allow_pickle=False)
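
Before feeding a dumped .f32 file to the vocoder, it can help to check that the file really holds whole frames of 32-bit floats with finite values. The following is only a minimal sketch using the Python standard library: the function name `load_f32_frames` is hypothetical, and `features_per_frame` is an assumption you must set to whatever frame size your `test_lpcnet` build reads (it is not stated in this thread).

```python
import array
import math
import os


def load_f32_frames(path, features_per_frame):
    """Read a raw float32 file and split it into frames.

    `features_per_frame` is an assumption supplied by the caller; it must
    match the frame size expected by the program that consumes the file.
    """
    size_bytes = os.path.getsize(path)
    n_floats, leftover = divmod(size_bytes, 4)  # 4 bytes per float32
    if leftover or n_floats % features_per_frame:
        raise ValueError(
            "{}: {} bytes is not a whole number of {}-float frames".format(
                path, size_bytes, features_per_frame))

    values = array.array("f")  # native-endian float32, same as ndarray.tofile()
    with open(path, "rb") as f:
        values.fromfile(f, n_floats)

    return [values[i:i + features_per_frame].tolist()
            for i in range(0, n_floats, features_per_frame)]


def summarize(frames):
    """Return basic statistics that make an obviously wrong file easy to spot."""
    flat = [v for frame in frames for v in frame]
    return {
        "frames": len(frames),
        "min": min(flat),
        "max": max(flat),
        "all_finite": all(math.isfinite(v) for v in flat),
    }
```

A `ValueError` here means the file size does not match the assumed frame size, which is worth ruling out before judging the audio quality. `np.fromfile(path, dtype=np.float32)` reads the same data if NumPy is preferred.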

Then I ran these commands:

make test_lpcnet taco=1 # Define TACOTRON2 macro
./test_lpcnet test_features.f32 test.s16
ffmpeg -f s16le -ar 16k -ac 1 -i test.s16 test-out.wav
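
The last command wraps the raw 16-bit PCM output in a WAV container. If ffmpeg is not available, roughly the same conversion can be sketched with Python's standard `wave` module. The function name `raw_s16_to_wav` is hypothetical, and the defaults (16 kHz, mono, little-endian samples) simply mirror the `-f s16le -ar 16k -ac 1` flags above.

```python
import wave


def raw_s16_to_wav(raw_path, wav_path, sample_rate=16000, channels=1):
    """Wrap headerless signed 16-bit little-endian PCM in a WAV file.

    Returns the number of audio frames written.
    """
    with open(raw_path, "rb") as f:
        pcm = f.read()

    bytes_per_frame = 2 * channels  # 2 bytes per 16-bit sample
    if len(pcm) % bytes_per_frame:
        raise ValueError("raw file does not contain whole 16-bit frames")

    with wave.open(wav_path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)           # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(pcm)          # WAV PCM is little-endian, so no byte swap

    return len(pcm) // bytes_per_frame
```

For example, `raw_s16_to_wav("test.s16", "test-out.wav")` would produce a file comparable to the ffmpeg output.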

But the resulting audio is bad:
[Image: lpctron]

@superhg2012

I ran into the same issue. Have you solved it? The features that Tacotron 2 predicts for LPCNet are not accurate.


lmingde commented Aug 21, 2019

> I ran into the same issue. Have you solved it? The features that Tacotron 2 predicts for LPCNet are not accurate.

No. For now I only use features extracted from audio with dump_data. I will try training Tacotron 2 (T2) to predict the features (.f32). Are you synthesizing from .f32 features?


lmingde commented Aug 21, 2019

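Also, I think the LPCNet in this repo needs to be updated. When I use it to synthesize from features extracted by dump_data (in taco mode), the result is very poor, but with the latest version of LPCNet the result is very good.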

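@ysujiang

@lmingde Hello, may I ask a few questions? I trained LPCNet and Tacotron 2 on my own dataset, and the synthesized speech quality is very poor: the speech is unclear and partly distorted. When training LPCNet, the loss at epoch 1 and the loss at epoch 120 differ very little. Did you see anything similar when training LPCNet, and roughly what was the loss of your final model? Also, which version is the "latest version of LPCNet" you mentioned? Could you share a link?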


lmingde commented May 25, 2020

@ysujiang The latest version is at the link given in the LPCNet paper.


ysujiang commented Jun 2, 2020

@lmingde Could you send me a link, or the title of the paper? Thanks.


ysujiang commented Jul 6, 2020

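@lmingde Hello, does the audio from your trained model have a tremolo (a trembling sound)?
The speech samples generated by my model currently have a tremolo. Have you run into this problem? Did you make any changes to taco or LPCNet?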


JunenuJ commented Sep 6, 2020

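> @lmingde Hello, may I ask a few questions? I trained LPCNet and Tacotron 2 on my own dataset, and the synthesized speech quality is very poor: the speech is unclear and partly distorted. When training LPCNet, the loss at epoch 1 and the loss at epoch 120 differ very little. Did you see anything similar when training LPCNet, and roughly what was the loss of your final model? Also, which version is the "latest version of LPCNet" you mentioned? Could you share a link?

Hello, may I ask whether you trained LPCNet and Tacotron 2 separately? I am still not clear on this and would appreciate some concrete guidance. Thanks.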


JunenuJ commented Sep 8, 2020

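> @lmingde Hello, does the audio from your trained model have a tremolo (a trembling sound)?
> The speech samples generated by my model currently have a tremolo. Have you run into this problem? Did you make any changes to taco or LPCNet?

Hello, I see that the code synthesizes speech with GL (Griffin-Lim). Did you synthesize with GL? My results with it are very poor. I would also like to ask how you synthesized with LPCNet: from the original features, or from the features predicted by Tacotron 2?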
