Partial results on the gpu batch recognizer #1539
Comments
Hello. It is possible, but not implemented.
Thanks! It's good to know that it's possible in principle. Can you give me a hint? Would such an implementation affect only vosk-api, or Kaldi too? And could you give me a direction to look in? I want to try to implement this feature.
Thanks, I'll let you know when I get something.
Hi. Sorry for the delay. I have created a pull request: #1554. I added partial results, but I don't know how to expose them to the other language bindings, so they are available only in C, and I added an example. In tests I got a limit of about 510-530 real-time streams from several test files on an RTX 2080 Ti, at about 15-20% utilization of the i7-8700. A problem I noticed: it crashes when removing the model when removing the CUDA pipeline instance, but I didn't look deeply into Kaldi.
Line 128 in 40937b6
Hello! A question on the topic of this issue: are there any plans to develop the Python code?
Hi. No plans for that. We are moving to PyTorch models, and they don't have partial results either.
Hi! Great project, I'm especially excited about the GPU support.
But I have a question: is it possible to use something like PartialResult() when working on the GPU (RTX 2080 Ti, CUDA 12.3), as is done in websocket/asr_server.py?
For example, a real-time audio stream analysis scenario is handled perfectly by the ASR server running on the CPU, but I would like more performance than a CPU can provide.
Best Regards.
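For reference, the CPU-side pattern the question refers to can be sketched roughly as below. This is a minimal illustration, not the code from websocket/asr_server.py: the helper name `stream_results` is hypothetical, and `rec` is assumed to be a `vosk.KaldiRecognizer` (or any object exposing the same `AcceptWaveform`, `Result`, `PartialResult` and `FinalResult` methods). Per the replies above, the GPU batch recognizer does not offer an equivalent of `PartialResult()`.

```python
import json


def stream_results(rec, chunks):
    """Yield ("partial" | "final", text) pairs for an iterable of PCM byte chunks.

    `rec` is assumed to behave like vosk.KaldiRecognizer (CPU recognizer).
    """
    for chunk in chunks:
        if rec.AcceptWaveform(chunk):
            # The recognizer detected the end of an utterance: finalized text.
            yield "final", json.loads(rec.Result()).get("text", "")
        else:
            # Still inside an utterance: report the current hypothesis.
            yield "partial", json.loads(rec.PartialResult()).get("partial", "")
    # Flush whatever is still buffered once the stream ends.
    yield "final", json.loads(rec.FinalResult()).get("text", "")


# Possible usage with the CPU recognizer (model path and sample rate are placeholders):
#   from vosk import Model, KaldiRecognizer
#   rec = KaldiRecognizer(Model("model"), 16000)
#   for kind, text in stream_results(rec, audio_chunks):
#       print(kind, text)
```

Taking the recognizer as a parameter keeps the loop independent of how the model is loaded, so the same loop could be pointed at a GPU-backed object if one with these methods becomes available.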