🔰 Request: Support Importing WhisperX JSON #28

Open
natelawrence opened this issue May 24, 2024 · 1 comment

Comments

natelawrence commented May 24, 2024

While recently looking for ASR providers that offer per-word timecodes, I found some I already knew of and a few I hadn't heard of before. One of them is WhisperX.

We are all familiar with OpenAI's Whisper. However, its default models only produce timecodes for long phrases, and those timecodes are not very accurate.
WhisperX builds on Whisper to provide more accurate timecodes, with start and end timecodes for every word in the transcript.

You can generate test data with a free demo of WhisperX.

Or, if you would like test data that is already generated:
ASR Timed Text Format Test 2 [WhisperX].json
The corresponding audio file can be obtained here.

Being able to import WhisperX's format would allow WhisperX users to bring their transcripts and edit them in HyperAudio Lite Editor, if desired.
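
To make the request more concrete, here is a rough sketch of what an importer might do. It assumes the WhisperX JSON has a top-level `segments` list whose entries carry a `words` list with `word`, `start`, and `end` fields (times in seconds), and that the editor wants one `<span>` per word with `data-m` (start, in milliseconds) and `data-d` (duration, in milliseconds). The function name `whisperx_to_spans` is hypothetical, and the fallback for words without timings is my own guess at a reasonable behaviour, not anything the editor does today.

```python
import html
import json


def whisperx_to_spans(data):
    """Turn parsed WhisperX JSON into one <p> of timed <span>s per segment."""
    paragraphs = []
    for segment in data.get("segments", []):
        spans = []
        # Track the end of the previous word so untimed words can reuse it.
        last_end = segment.get("start", 0.0)
        for word in segment.get("words", []):
            # Assumption: words that could not be aligned have no "start"/"end".
            # Place them at the previous word's end with zero duration.
            start = word.get("start", last_end)
            end = word.get("end", start)
            last_end = end
            start_ms = round(start * 1000)
            duration_ms = round(end * 1000) - start_ms
            text = html.escape(word["word"].strip())
            spans.append(
                f'<span data-m="{start_ms}" data-d="{duration_ms}">{text} </span>'
            )
        if spans:
            paragraphs.append("<p>" + "".join(spans) + "</p>")
    return "\n".join(paragraphs)


# Hand-written sample in the assumed WhisperX shape (not real output).
sample = json.loads("""
{
  "segments": [
    {
      "start": 0.5,
      "end": 1.9,
      "text": " Hello world 2024",
      "words": [
        {"word": "Hello", "start": 0.5, "end": 0.9, "score": 0.91},
        {"word": "world", "start": 1.0, "end": 1.4, "score": 0.88},
        {"word": "2024"}
      ]
    }
  ]
}
""")

print(whisperx_to_spans(sample))
```

Mapping each segment to a paragraph is only one option; an importer could just as well split paragraphs on pauses or speaker changes if that information is present.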

natelawrence commented May 25, 2024

Also see this related issue from the HyperAudio Lite Editor issues:
🔰 Integrate replicate.com WhisperX
