Repository to train voice models with new speech recognition models using KALDI.

NeuroLexDiagnostics/train-ASR

train-ASR

Repository to train voice models with new speech recognition models using KALDI.

team members

Active members of the team working on this repo include:

  • Dhruv Rajani (Arizona State University)
  • Jim Schwoebel (Boston, MA)

how to download data

We have a custom phoneme dataset that can be accessed here.

goals / what to beat

The goal of the ASR project is to build an ASR model from a database of phonemes that we have assembled. I have created a training dataset of about 40 phonemes to help recognize my own voice, which could be a starting point. There are also toolkits like KALDI (usable from Python through the PyKaldi wrapper, https://github.com/pykaldi/pykaldi) where you can build custom ASR models on top of pre-trained models using GMMs and other acoustic and language models. These can correct for speaker-specific characteristics such as speaker frequencies and vocal tract length, so that the model is not overfitted to one speaker. The goal here is to look deeper into these technologies and build a minimum viable ASR with a tailored vocabulary (this may be important for some of the surveys we're working on, especially for medical words in transcription).
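
To make the acoustic-model idea a bit more concrete, here is a minimal sketch of phoneme classification with one diagonal Gaussian per phoneme, which is the simplest special case of a GMM. It does not use the KALDI or PyKaldi APIs, and it is not the method used in this repo. The phoneme labels, the 2-dimensional "features" and the helper names (`fit_diag_gaussians`, `classify`, `make_toy_data`) are all made up for illustration; a real system would use acoustic features such as MFCCs, several mixture components per phoneme, and an HMM on top.

```python
import math
import random
from collections import defaultdict


def fit_diag_gaussians(samples):
    """Fit one diagonal Gaussian per phoneme label.

    samples: list of (label, feature_vector) pairs.
    Returns a dict mapping label -> (mean, variance) lists.
    """
    grouped = defaultdict(list)
    for label, vec in samples:
        grouped[label].append(vec)

    models = {}
    for label, vecs in grouped.items():
        n, dim = len(vecs), len(vecs[0])
        mean = [sum(v[d] for v in vecs) / n for d in range(dim)]
        # A variance floor keeps the log-likelihood finite for tiny classes.
        var = [
            max(sum((v[d] - mean[d]) ** 2 for v in vecs) / n, 1e-3)
            for d in range(dim)
        ]
        models[label] = (mean, var)
    return models


def log_likelihood(vec, model):
    """Log density of vec under a diagonal Gaussian (mean, variance)."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * s) + (x - m) ** 2 / s)
        for x, m, s in zip(vec, mean, var)
    )


def classify(vec, models):
    """Return the phoneme label whose Gaussian scores vec highest."""
    return max(models, key=lambda label: log_likelihood(vec, models[label]))


def make_toy_data(seed=0, per_class=50):
    """Synthetic 2-D 'features' for three hypothetical phoneme labels."""
    rng = random.Random(seed)
    centers = {"AA": (1.0, 0.0), "IY": (-1.0, 2.0), "S": (3.0, 3.0)}
    return [
        (label, [rng.gauss(c, 0.3) for c in center])
        for label, center in centers.items()
        for _ in range(per_class)
    ]


models = fit_diag_gaussians(make_toy_data())
print(classify([0.9, 0.1], models))
```

Because each phoneme gets its own model, a tailored vocabulary can in principle be supported by adding pronunciations for new words rather than retraining everything. Training on a single speaker, as in this toy setup, is exactly the situation where speaker normalization matters.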

references
