In a recent hunt for more ASR providers that offer per-word timecodes, I found some I already knew of and a few I hadn't heard of before. Among them is WhisperX.
We are all familiar with OpenAI's Whisper technology; however, its default models only produce timecodes for long phrases, and those timecodes are not very accurate. WhisperX is a fork of Whisper that provides more accurate timecodes, with start and end timecodes for every word in the transcript.
You can generate test data with a free demo of WhisperX.
Or, if you would like test data that is already generated:
ASR Timed Text Format Test 2 [WhisperX].json
The corresponding audio file can be obtained here.
Supporting import of WhisperX's format would let WhisperX users bring their transcripts into HyperAudio Lite Editor and edit them there.
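To give a rough idea of what an importer would need to handle, here is a minimal Python sketch that flattens WhisperX-style JSON into a simple word list. It assumes the layout WhisperX typically writes (a top-level `segments` array whose entries carry a `words` array with `word`, `start`, `end`, and `score` fields); the attached test file should be checked against that assumption. The function name `whisperx_to_words` and the output shape are hypothetical, chosen for illustration only, and this is not HyperAudio's actual import code.

```python
import json


def whisperx_to_words(data):
    """Flatten WhisperX-style JSON into a list of {text, start, end} dicts.

    Times are kept in seconds, as in the input. The field names
    ("segments", "words", "word", "start", "end") are an assumption
    about the WhisperX output format, not a confirmed specification.
    """
    words = []
    for segment in data.get("segments", []):
        # Fallback time for a word that has no timing of its own.
        prev_end = segment.get("start", 0.0)
        for w in segment.get("words", []):
            # Some tokens may come back without timings; in that case
            # reuse the previous word's end so the sequence stays ordered.
            start = w.get("start", prev_end)
            end = w.get("end", start)
            words.append({"text": w["word"].strip(), "start": start, "end": end})
            prev_end = end
    return words


# Tiny made-up sample for illustration; not taken from the attached test file.
sample = json.loads("""
{
  "segments": [
    {
      "start": 0.5,
      "end": 2.0,
      "text": " Hello world 42",
      "words": [
        {"word": "Hello", "start": 0.5, "end": 0.9, "score": 0.98},
        {"word": "world", "start": 1.0, "end": 1.4, "score": 0.95},
        {"word": "42"}
      ]
    }
  ]
}
""")

for word in whisperx_to_words(sample):
    print(word["text"], word["start"], word["end"])
```

The one design choice worth noting is the fallback for words without timings: an editor that highlights words during playback needs every word to have some position, so the sketch borrows the previous word's end time rather than dropping the word.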