Version 1.1.0 has onnxruntime thread affinity crash #1169
Can you limit the number of threads here and try again?
Which API is available to set the number of threads?
Just change it in vad.py to:
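The code block from this comment was lost; presumably the edit pins the VAD session options to a fixed thread count instead of 0, along these lines (the exact values are an assumption, 1 being the safest per the discussion below):

```python
import onnxruntime

# Sketch of the suggested vad.py edit (exact snippet lost): replace the 0
# thread counts, which make onnxruntime infer a number (and fail), with a
# fixed value.
opts = onnxruntime.SessionOptions()
opts.inter_op_num_threads = 1  # was 0
opts.intra_op_num_threads = 1  # was 0
```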
I'm running the code in a Docker environment that just pulls in the released package, so I'd have to either edit the installed vad.py or monkey patch it. Or am I missing a third option?
No third option currently, I just want you to test the fix first before we actually take any steps to fix it.
Tried monkey patching; this does remove the onnxruntime error, but the OOM error still persisted. It turned out to be … I'm now using … These are my current packages in case someone else runs into the issue:
monkey patch:
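The patch itself was lost from this comment; a minimal sketch of one way to do it, assuming faster-whisper passes its options to onnxruntime.InferenceSession via the sess_options keyword (the actual patch used here may have differed):

```python
import onnxruntime

_OriginalInferenceSession = onnxruntime.InferenceSession

def _patched_inference_session(*args, **kwargs):
    # Force fixed thread counts on whatever session options faster-whisper
    # built, so the 0 defaults never reach onnxruntime.
    opts = kwargs.get("sess_options")
    if opts is not None:
        opts.inter_op_num_threads = 1
        opts.intra_op_num_threads = 1
    return _OriginalInferenceSession(*args, **kwargs)

# Apply before faster-whisper creates the VAD session.
onnxruntime.InferenceSession = _patched_inference_session
```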
I think it should be:
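This snippet was also lost; judging from the reply below, the suggestion concerned the thread-count value. A hypothetical reading, e.g. sizing the pool to the machine's core count instead of a hard-coded 1:

```python
import os
import onnxruntime

# Hypothetical alternative value (not confirmed in the thread): use the
# machine's core count for the intra-op pool rather than 1.
opts = onnxruntime.SessionOptions()
opts.inter_op_num_threads = 1
opts.intra_op_num_threads = os.cpu_count()
```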
The error he's mentioning is only caused when the value is 0, since that means onnx must infer the actual number, and it fails to do so. Any fixed number should fix the error; setting it to 1 should be the safest, but not the fastest. Also, the VAD encoder now benefits from GPU acceleration if anyone needs it.
Reported problems: SYSTRAN#1193, SYSTRAN#1169. The VAD implementation consumes humongous amounts of memory [the original Silero doesn't have this problem]. This PR should fix the OOM problem. An alternative solution could be removing 'lru_cache'.
@Appfinity-development
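For context on the lru_cache remark: faster-whisper caches the VAD model loader, so the ONNX sessions it creates live for the whole process. A simplified sketch of that shape (stand-in body, not the exact source):

```python
import functools

# Simplified shape of the cached loader in faster_whisper/vad.py; removing
# the decorator is the alternative fix mentioned above, letting the model
# (and its ONNX sessions) be released instead of cached for the process.
@functools.lru_cache
def get_vad_model():
    model = ...  # the real code builds the onnxruntime sessions here
    return model
```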
Updated from 1.0.3 to 1.1.0. Now an onnxruntime thread affinity crash occurs each time. Both versions run on an Nvidia A40 with 4 CPU cores, 48 GB VRAM, and 16 GB RAM (on a private Replicate server). Shouldn't be a hardware issue. Our model config:
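The config block itself was lost; the model name and options below are assumptions, showing a typical faster-whisper GPU setup of this shape:

```python
from faster_whisper import WhisperModel

# Hypothetical reconstruction (the issue's actual config was lost): a
# typical faster-whisper GPU configuration with the VAD filter enabled.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.wav", vad_filter=True)
```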
Also tried this:
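What exactly was tried here was lost as well; a hedged guess, given the thread-related crash, is capping CPU threads from the caller side:

```python
from faster_whisper import WhisperModel

# Hypothetical (the original snippet was lost): one common attempt for
# thread-related crashes is limiting CPU threads via the model argument.
model = WhisperModel(
    "large-v3", device="cuda", compute_type="float16", cpu_threads=4
)
```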
But to no avail. Any suggestions? The crash log is below.
The cog.yaml with dependencies looks like this:
Also tried removing the onnxruntime dependency and pinning it to a specific GPU version, but nothing fixes the issue. Anyone with ideas (@MahmoudAshraf97)?
If `cpu` is used as `device` on `WhisperModel`, the onnxruntime error still shows in the logs, but there is no crash and transcribing finishes successfully.
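For completeness, the CPU fallback described above would look like this (model name and compute type are assumptions):

```python
from faster_whisper import WhisperModel

# Assumed illustration: on CPU the onnxruntime warning still appears in
# the logs, but the thread affinity crash does not occur.
model = WhisperModel("large-v3", device="cpu", compute_type="int8")
```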