Investigate Word-level timestamps for discarding processed audio #330

makaveli10 · 2025-01-21T08:55:10Z

In the faster-whisper backend, the current implementation treats the last segment as incomplete, on the assumption that its final word may be cut off, so that processed audio can be discarded correctly.
The idea is to use word-level timestamps instead and keep only the audio for the last word.
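
A rough sketch of what this could look like, not the backend's actual code. faster-whisper's `transcribe(..., word_timestamps=True)` yields segments whose `words` carry `start`, `end`, and `word`; the `Word` dataclass below is a stand-in for those objects, and `split_at_last_word` and `trim_buffer` are hypothetical helper names. It assumes timestamps are in seconds relative to the start of the current buffer and that audio is sampled at 16 kHz.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

SAMPLE_RATE = 16000  # assumption: 16 kHz mono audio, as Whisper models expect


@dataclass
class Word:
    """Stand-in for a word object returned with word_timestamps=True."""
    start: float  # seconds from the start of the buffer
    end: float
    word: str


def split_at_last_word(words: Sequence[Word]) -> Tuple[List[Word], float]:
    """Treat every word except the last as complete.

    Returns the completed words and the time (in seconds) from which
    audio must be kept, i.e. the start of the last word.
    """
    if not words:
        return [], 0.0
    *completed, last = words
    return completed, last.start


def trim_buffer(samples, keep_from: float, sample_rate: int = SAMPLE_RATE):
    """Drop all samples before `keep_from` seconds."""
    cut = int(keep_from * sample_rate)
    return samples[cut:]


# Example: three words, the last one possibly cut off.
words = [
    Word(0.0, 0.4, " hello"),
    Word(0.5, 0.9, " world"),
    Word(1.0, 1.3, " inc"),
]
completed, keep_from = split_at_last_word(words)

buffer = [0.0] * (2 * SAMPLE_RATE)  # two seconds of silence as a placeholder
remaining = trim_buffer(buffer, keep_from)
```

Compared with holding back the whole last segment, this keeps only the audio from the start of the final word, so less audio is re-transcribed on the next pass. Whether the word boundaries are accurate enough for this is what the investigation would need to confirm.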
