Fully End-to-End Automatic Speech Recognition from Raw Audio
The aim of this project is to identify and implement fully end-to-end Automatic Speech Recognition (ASR) systems that operate on raw audio. A fully end-to-end ASR system takes the raw audio waveform as its input rather than extracted features (e.g., MFCCs).
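To make the distinction concrete, the following is a minimal sketch of a raw-audio front end: a bank of 1-D convolution filters applied directly to waveform samples, in place of a fixed feature extractor such as MFCCs. It is not taken from this code base. The filter count, kernel width, stride, the random (untrained) kernel weights, and the synthetic sine-wave input are all illustrative assumptions; in a real system the kernels would be learned jointly with the rest of the network.

```python
import math
import random


def conv1d(samples, kernels, stride):
    """Slide each kernel over the raw samples (no padding) and return
    one feature vector per frame, with one value per kernel."""
    width = len(kernels[0])
    frames = []
    for start in range(0, len(samples) - width + 1, stride):
        window = samples[start:start + width]
        frames.append(
            [sum(w * x for w, x in zip(kernel, window)) for kernel in kernels]
        )
    return frames


def relu(frames):
    """Element-wise rectifier applied to every frame."""
    return [[max(0.0, v) for v in frame] for frame in frames]


# Illustrative settings (assumptions, not values from this repository).
SAMPLE_RATE = 16000   # samples per second
NUM_FILTERS = 4       # number of convolution kernels
KERNEL_WIDTH = 30     # kernel length in samples
STRIDE = 10           # hop between frames in samples

# Synthetic stand-in for raw audio: 0.1 s of a 440 Hz tone.
waveform = [
    math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(1600)
]

# Random weights stand in for kernels that training would normally learn.
rng = random.Random(0)
kernels = [
    [rng.uniform(-0.1, 0.1) for _ in range(KERNEL_WIDTH)]
    for _ in range(NUM_FILTERS)
]

# The features are computed from the samples themselves, with no
# spectrogram or MFCC step in between.
features = relu(conv1d(waveform, kernels, STRIDE))
print(len(features), len(features[0]))
```

Each output frame is a small vector that later layers can consume in the same way they would consume an MFCC frame; the difference is that the transformation from samples to features is trainable.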
How to use the code base
The scripts folder contains example code that you can change to build your own pipeline.
Implemented Papers:
[1] Analysis of CNN-based Speech Recognition System using Raw Speech as Input