Transcribe multiple files from wav to text with timestamps using Azure Speech Recognition API
- A subscription key for the Speech service. See Try the speech service for free.
- Files prepared in wav format, 16kHz sample rate & mono
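To check whether a file meets the format requirement before sending it to Azure, one option is to read its RIFF header. The following is a minimal sketch, not part of this repository; the function names `readWavFormat`, `isSupportedWav` and `isSupportedWavFile` are made up for illustration.

```javascript
const fs = require("fs");

// Locate the "fmt " chunk of a RIFF/WAVE buffer and return its basic properties.
function readWavFormat(buffer) {
  if (
    buffer.length < 12 ||
    buffer.toString("ascii", 0, 4) !== "RIFF" ||
    buffer.toString("ascii", 8, 12) !== "WAVE"
  ) {
    throw new Error("Not a RIFF/WAVE file");
  }
  let offset = 12; // first chunk starts right after the 12-byte RIFF header
  while (offset + 8 <= buffer.length) {
    const id = buffer.toString("ascii", offset, offset + 4);
    const size = buffer.readUInt32LE(offset + 4);
    if (id === "fmt ") {
      if (offset + 8 + 16 > buffer.length) {
        throw new Error("Truncated fmt chunk");
      }
      return {
        channels: buffer.readUInt16LE(offset + 10),
        sampleRate: buffer.readUInt32LE(offset + 12),
        bitsPerSample: buffer.readUInt16LE(offset + 22),
      };
    }
    offset += 8 + size + (size % 2); // chunks are padded to an even length
  }
  throw new Error("No fmt chunk found");
}

// True when the audio is 16 kHz mono, as required above.
function isSupportedWav(buffer) {
  const fmt = readWavFormat(buffer);
  return fmt.sampleRate === 16000 && fmt.channels === 1;
}

// Convenience wrapper that reads the whole file from disk (fine for a quick check).
function isSupportedWavFile(filePath) {
  return isSupportedWav(fs.readFileSync(filePath));
}
```

For example, `isSupportedWavFile("wav/my-recording.wav")` would return false for a 44.1kHz stereo file, which then needs converting first.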
- run npm install
- replace SUBSCRIPTION_KEY & SERVICE_REGION with your Azure config
- run node index.js

example-*.wav.txt output files are generated for the default audio files
- Put audio files in the wav directory
- replace SPEECH_RECOGNITION_LANGUAGE with the language your files are in
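As an illustration of what the three settings might look like once filled in, here is a sketch with placeholder values. The `config` object and the `missingSettings` helper are assumptions made for this example and may not match how the script stores its settings; the region and locale values are only examples.

```javascript
// Placeholder values, not real credentials.
const config = {
  SUBSCRIPTION_KEY: "<your-speech-service-key>", // key of your Speech resource
  SERVICE_REGION: "westeurope",                  // region of your Speech resource
  SPEECH_RECOGNITION_LANGUAGE: "en-US",          // locale of the audio, e.g. "de-DE"
};

// Hypothetical helper: list settings that are empty or still "<placeholder>" values.
function missingSettings(settings) {
  return Object.keys(settings).filter((name) => {
    const value = String(settings[name] || "").trim();
    return value === "" || /^<.*>$/.test(value);
  });
}
```

Calling `missingSettings(config)` before starting a long transcription run gives an early hint that a value was left unset.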
The script connects to Azure and transcribes the files step by step.
For a 1-hour audio file, Azure needs about 20 minutes to finish. Multiple files are transcribed in parallel, so fifteen 1-hour files would still take about 20 minutes. Each recognized sentence is incrementally appended to a newly created source_file.wav.txt file.
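The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in index.js: `recognize` is a hypothetical stand-in for the Azure call (it is expected to invoke `onSentence({ offset, text })` once per recognized sentence and return a promise that resolves at the end of the audio), and the `hh:mm:ss text` line format is assumed. The Speech SDK reports result offsets in 100-nanosecond ticks, which is what `formatOffset` converts.

```javascript
const fs = require("fs");

const TICKS_PER_SECOND = 1e7; // one tick is 100 nanoseconds

// Convert a tick offset into an "hh:mm:ss" timestamp.
function formatOffset(ticks) {
  const totalSeconds = Math.floor(ticks / TICKS_PER_SECOND);
  const pad = (n) => String(n).padStart(2, "0");
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)}`;
}

// Transcribe one file; `recognize` is the hypothetical recognizer described above.
async function transcribeFile(wavPath, recognize) {
  const outPath = `${wavPath}.txt`;
  fs.writeFileSync(outPath, ""); // start fresh: an existing file is overwritten
  await recognize(wavPath, ({ offset, text }) => {
    // Append each sentence as soon as it arrives, prefixed with its timestamp.
    fs.appendFileSync(outPath, `${formatOffset(offset)} ${text}\n`);
  });
  return outPath;
}

// Start every file at once, so total time is roughly that of the slowest file.
function transcribeAll(wavPaths, recognize) {
  return Promise.all(wavPaths.map((p) => transcribeFile(p, recognize)));
}
```

Because all recognitions are started together with `Promise.all`, the wall-clock time stays close to that of a single file, which matches the timing described above.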
Notes:
- On macOS you can easily convert to 16kHz & mono using iTunes (Music.app) (https://support.apple.com/en-us/HT204310)
- txt files with the same name will be overwritten on script run
- Pull Requests adding support for programmatic mp3 -> wav conversion are welcome
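One possible starting point for such a conversion is to shell out to ffmpeg from Node. This is only a sketch and not part of the script: it assumes ffmpeg is installed and on the PATH, and the names `ffmpegArgs` and `convertToWav` are made up for illustration.

```javascript
const { spawn } = require("child_process");

// Build ffmpeg arguments that produce a 16 kHz, mono, 16-bit PCM wav file.
function ffmpegArgs(input, output) {
  return [
    "-y",                // overwrite the output file if it exists
    "-i", input,         // source file, e.g. an mp3
    "-ar", "16000",      // 16 kHz sample rate
    "-ac", "1",          // mono
    "-c:a", "pcm_s16le", // 16-bit PCM
    output,
  ];
}

// Run ffmpeg and resolve with the output path once it exits successfully.
function convertToWav(input, output) {
  return new Promise((resolve, reject) => {
    const proc = spawn("ffmpeg", ffmpegArgs(input, output), { stdio: "ignore" });
    proc.on("error", reject); // e.g. ffmpeg is not installed
    proc.on("close", (code) =>
      code === 0 ? resolve(output) : reject(new Error(`ffmpeg exited with code ${code}`))
    );
  });
}
```

A call such as `convertToWav("talk.mp3", "wav/talk.wav")` would then place the converted file where the script looks for input.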