Is long-form inference possible with `whisper_trt`? I tried inference on a 4m16s audio clip and it appeared to only transcribe 30s. Here is my script:

```python
from whisper_trt import load_trt_model

model = load_trt_model("small.en")
result = model.transcribe("test.wav")
```
Hi @eschmidbauer ,
It should be possible, but it seems like we'll need to make some modifications to the `transcribe` function:

whisper_trt/whisper_trt/model.py, line 162 in 268eff1
Currently, it runs on a single 30s window.
John
It would be great to demonstrate long-form transcription here, perhaps by using a sliding window.
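For anyone who needs a workaround in the meantime, here is a minimal sketch of that idea: split the audio into 30s chunks and call `transcribe` once per chunk. It assumes `model.transcribe()` takes a file path (as in the script above) and returns a Whisper-style dict with a `"text"` field; `transcribe_long`, `CHUNK_SECONDS`, and the use of `soundfile` are just illustrative choices, not part of the `whisper_trt` API.

```python
import tempfile

import soundfile as sf
from whisper_trt import load_trt_model

CHUNK_SECONDS = 30  # whisper models operate on 30 s windows


def transcribe_long(model, path):
    """Transcribe an arbitrarily long WAV by chunking it into 30 s windows."""
    audio, sr = sf.read(path)               # full clip as a float array
    samples_per_chunk = CHUNK_SECONDS * sr
    texts = []
    for start in range(0, len(audio), samples_per_chunk):
        chunk = audio[start:start + samples_per_chunk]
        # Write each window to a temporary WAV so we can reuse the
        # path-based transcribe() call from the script above.
        with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
            sf.write(tmp.name, chunk, sr)
            result = model.transcribe(tmp.name)
            # Assumes a Whisper-style result dict with a "text" field.
            texts.append(result["text"].strip())
    return " ".join(texts)


model = load_trt_model("small.en")
print(transcribe_long(model, "test.wav"))
```

Note this just concatenates non-overlapping chunk outputs; a proper sliding-window implementation would overlap adjacent windows and merge on timestamps to avoid cutting words at chunk boundaries.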