Hey there!
First of all: Nice tutorial. Easy to follow and well-explained.
However, I ran into some errors on my first attempts.
I have one question and two suggestions/comments for improvement.
1. What does line 61 in main.py actually do? I could not figure out its purpose, so I adjusted it for my needs (see point 2).
2. I adjusted the code section that passes the sample rate to mfcc_hires.conf. I added the strip() method to line 60 because the code threw an error on the first run: my config line had trailing spaces. My suggestion looks as follows:
# Reformat the line to use the sample rate of the .wav file
line = line.strip().split("=")  # strip() removes trailing whitespace before splitting
print("list of line elements in mfcc_hires.conf file: ", line)
line[1] = str(sample_rate)  # overwrite the old sample rate at index 1; str() so that join() accepts it
myseparator = "="
line = myseparator.join(line)
3. I created a Kaldi-like 'text' file as the decoding step did not work without this file.
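For point 2, here is a minimal self-contained sketch of the same idea as a helper function. The function name `set_sample_rate` and the option name `--sample-frequency` are assumptions for illustration; I have not checked them against the tutorial's main.py or its mfcc_hires.conf, so adjust them to whatever your files actually contain.

```python
def set_sample_rate(line, sample_rate, key="--sample-frequency"):
    """Return the config line with its value replaced by sample_rate.

    Hypothetical helper: `key` is an assumed option name, not taken
    from the tutorial. Lines with a different key are returned
    unchanged, apart from stripped surrounding whitespace.
    """
    stripped = line.strip()  # drop trailing spaces and the newline
    name, sep, _old_value = stripped.partition("=")
    if sep and name.strip() == key:
        # Build the line from strings so an int sample rate works too
        return f"{name.strip()}={sample_rate}"
    return stripped


# Example: a line with trailing spaces, as in the error described above
new_line = set_sample_rate("--sample-frequency=16000   \n", 8000)
other_line = set_sample_rate("--num-mel-bins=40\n", 8000)
```

Using `partition("=")` instead of `split("=")` avoids an IndexError on lines that contain no `=` at all, such as blank lines.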
Hi @kak-to-tak, this 'text' file contains the transcription of each utterance in the audio file. If speaker information is available in your project setup, each line of the 'text' file could have the following structure: <speaker_id>_<utterance_ID> <transcription of each sentence/segment, if you have segmented the audio file>. Check the Kaldi dummy tutorial here to get an idea of it. Usually, you have to prepare such a file manually and transcribe the audio yourself. Why? The model has to be trained, and it learns from these transcriptions. If you do not feed it transcriptions during training, it will not learn how to transcribe.
Also check this Kaldi tutorial to get a glimpse of how Kaldi works.
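The line structure described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `build_text_lines`, the speaker and utterance IDs, and the transcriptions are all made up, and the sorting step reflects that Kaldi's data files are generally expected to be sorted by utterance ID.

```python
def build_text_lines(entries):
    """Build lines of the form '<speaker_id>_<utterance_ID> <transcription>'.

    `entries` is an iterable of (speaker_id, utterance_id, transcription)
    tuples. All names here are hypothetical, for illustration only.
    """
    rows = []
    for speaker_id, utterance_id, transcription in entries:
        utt_key = f"{speaker_id}_{utterance_id}"
        # Collapse repeated whitespace so each line has single-space fields
        words = " ".join(transcription.split())
        rows.append((utt_key, words))
    rows.sort(key=lambda row: row[0])  # sort by utterance key
    return [f"{key} {words}" for key, words in rows]


# Made-up example data
example = [
    ("spk1", "utt2", "good  morning"),
    ("spk1", "utt1", "hello world"),
]
lines = build_text_lines(example)

# To write the file, something like:
# with open("text", "w", encoding="utf-8") as f:
#     f.write("\n".join(lines) + "\n")
```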