This repository has been archived by the owner on Feb 9, 2023. It is now read-only.
Hi Mr. Rolczynski,
I tried to train a new model on a new training dataset. The code is below and is the same as the example you give on GitHub:
import numpy as np
import tensorflow as tf
import automatic_speech_recognition as asr

# Training and validation sets, each described by a CSV file
dataset = asr.dataset.Audio.from_csv('C:/Users/XXXXX/Automatic-Speech-Recognition-master/84-121123-dev.csv', batch_size=25)
dev_dataset = asr.dataset.Audio.from_csv('C:/Users/XXXXX/Automatic-Speech-Recognition-master/84-121550-dev.csv', batch_size=25)

# Pipeline components: alphabet, feature extractor, model, optimizer, decoder
alphabet = asr.text.Alphabet(lang='en')
features_extractor = asr.features.FilterBanks(
    features_num=160,
    winlen=0.02,
    winstep=0.01,
    winfunc=np.hanning
)
model = asr.model.get_deepspeech2(
    input_dim=160,  # matches features_num above
    output_dim=29,
    rnn_units=800,
    is_mixed_precision=False
)
optimizer = tf.optimizers.Adam(
    lr=1e-4,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-8
)
decoder = asr.decoder.GreedyDecoder()
pipeline = asr.pipeline.CTCPipeline(
    alphabet, features_extractor, model, optimizer, decoder
)

# Train, then write every component to the checkpoint directory
pipeline.fit(dataset, dev_dataset, epochs=25)
pipeline.save('C:/Users/XXXX/Automatic-Speech-Recognition-master/Automatic-Speech-Recognition-master/automatic_speech_recognition/checkpoint/')
Training produced the following files in the checkpoint directory:
alphabet.bin, decoder.bin, feature_extractor.bin, and model.h5
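For reference, here is a small standard-library sketch that checks whether a checkpoint directory contains the four files listed above. The helper name `missing_checkpoint_files` is hypothetical and is not part of the library. It only confirms the files exist, not that they load correctly.

```python
from pathlib import Path

# File names taken from the checkpoint listing above.
EXPECTED_FILES = ("alphabet.bin", "decoder.bin", "feature_extractor.bin", "model.h5")


def missing_checkpoint_files(directory, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from `directory`."""
    root = Path(directory)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list means all four files are present in the directory passed in.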
My question is how to load the model I have just created. I believe the code you provided for testing a pre-trained model (below) works only with the pre-trained deepspeech2 model and not with my own model.
file = 'to/test/sample.wav' # sample rate 16 kHz, and 16 bit depth
sample = asr.utils.read_audio(file)
pipeline = asr.load('deepspeech2', lang='en')
pipeline.model.summary() # TensorFlow model
sentences = pipeline.predict([sample])
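Since the sample is expected to be 16 kHz with 16-bit depth, a format mismatch can be ruled out with Python's standard `wave` module before calling the pipeline. This is a minimal sketch and `check_wav_format` is a hypothetical helper, not something the library provides.

```python
import wave


def check_wav_format(path, sample_rate=16000, sample_width_bytes=2):
    """Return True if the WAV file has the given sample rate and sample width.

    A sample width of 2 bytes corresponds to 16-bit audio.
    """
    with wave.open(path, "rb") as wav:
        return (wav.getframerate() == sample_rate
                and wav.getsampwidth() == sample_width_bytes)
```

Calling `check_wav_format('to/test/sample.wav')` on the file used above would report whether it meets the stated requirement.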
Could you please help me resolve this? I really appreciate your effort in helping a wider audience understand how speech-to-text recognition works.