Replies: 4 comments
>>> Sushantmkarande
[April 19, 2019, 7:35am]
Hello,
I have successfully tried training the DeepSpeech 0.4.1 model on my own
dataset. These are the steps I followed:
Downloaded the Mozilla Common Voice 22 GB corpus for English.
I was going to create my own new TSV, but could not figure out what the
client id in the corpus TSV is.
So I just overwrote the sentences in the corpus TSV with my own and
replaced the MP3 files at the corresponding TSV paths with mine, for 15
samples.
1. This time I am going to create a larger dataset of around 600
samples, so my question is: what is the client id in the Mozilla corpus,
and how do I create a large sample dataset for this model? Is there a
script available for this?
2. Accuracy on Indian accents is very low. Will it help if I retrain the
model using only the Mozilla Indian-accent samples, which have already
been used to train the actual 0.4.1 model?
3. Is any preprocessing needed to reduce noise before giving a WAV file
to the model for prediction? I am using pyaudio with these settings:
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
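
For the dataset question above, here is a minimal sketch of one way to build a training manifest for around 600 clips. It assumes the three-column CSV layout (wav_filename, wav_filesize, transcript) that DeepSpeech's import scripts produce, so no client id column is involved. The function names, the transcripts dictionary, and the lowercasing step are hypothetical choices for illustration, and converting MP3 to WAV is assumed to have been done beforehand.

```python
import csv
import wave
from pathlib import Path

# Column names assumed to match the CSV layout written by DeepSpeech's importers.
FIELDNAMES = ["wav_filename", "wav_filesize", "transcript"]


def is_16k_mono_16bit(wav_path):
    """Return True if the WAV matches RATE=16000, CHANNELS=1, 16-bit samples."""
    with wave.open(str(wav_path), "rb") as wav:
        return (
            wav.getframerate() == 16000
            and wav.getnchannels() == 1
            and wav.getsampwidth() == 2
        )


def build_manifest(wav_dir, transcripts, out_csv):
    """Write one CSV row per usable WAV file and return the number of rows.

    transcripts is a hypothetical dict mapping a file name such as
    "clip1.wav" to its sentence.
    """
    rows = []
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        text = transcripts.get(wav_path.name)
        if text is None or not is_16k_mono_16bit(wav_path):
            continue  # skip clips with no transcript or the wrong audio format
        rows.append({
            "wav_filename": str(wav_path.resolve()),
            "wav_filesize": wav_path.stat().st_size,
            # Assumption: the alphabet in use is lowercase only.
            "transcript": text.lower().strip(),
        })
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Something like build_manifest("clips_wav", transcripts, "train.csv") would then give a file that could be passed to training, for example through the --train_files option. The format check uses the same values as the pyaudio settings listed in the post.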
[This is an archived TTS discussion thread from discourse.mozilla.org/t/need-some-clarification-on-training-on-already-pretrained-model]