Separating LLM model GPU from TTS model GPU #6068
Replies: 2 comments
-
This sounds like a question for alltalk_tts. Basically change everything that says …
-
Yeah — the indices are zero-based, i.e. the first card is 0 and "1" means the second card, IIRC. I haven't done SLI in a long time, but that's how I understood the env variable. My advice, since AllTalk respects the `CUDA_VISIBLE_DEVICES=x` env var, would be to:

1. Go to the script.py for AllTalk and assign the desired CUDA index near the top, before anything hooks into CUDA — for the first card use 0, for the second use 1, and so on (see the sketch below).
2. Do the same for any other extension you want to segregate onto its own CUDA device. E.g., server.py/the one-click launcher would set CUDA_VISIBLE_DEVICES=3 while AllTalk's script.py sets CUDA_VISIBLE_DEVICES=0.

There may be some cross talk — e.g. in the above, the main webui could still dip into AllTalk's CUDA — but it's close. One would hope it gets the idea of what the user is trying to do, but that's hoping.

Edit: oh yeah, don't forget to remove `set CUDA_VISIBLE_DEVICES=x` from your global environment altogether, or keep it out of your env completely — a globally set value would of course make this method do nothing.
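A minimal sketch of the script.py idea described above, assuming the component runs in a process that hasn't initialized CUDA yet; the file path and the chosen index are illustrative, not AllTalk's actual code:

```python
# Illustrative top of an extension's script.py (e.g. extensions/alltalk_tts/script.py).
# CUDA_VISIBLE_DEVICES is only read when CUDA initializes, so it must be set
# before torch (or anything that imports torch) is loaded in this process.
import os

# Pin this component to physical GPU 1. Inside the process, that card is
# renumbered and shows up as cuda:0.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch

print(torch.cuda.device_count())      # 1 — only the pinned card is visible
print(torch.cuda.get_device_name(0))  # name of physical GPU 1
```

The cross-talk caveat in the reply applies: if the webui process has already touched CUDA before this module is imported, the variable is ignored for that process, so this only works cleanly when the component runs as its own process.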
-
I have several Nvidia graphics cards. I can run the webui on any combination of cards, no problem, by setting:

```
export CUDA_VISIBLE_DEVICES=0,1,2,3
```

But that sets the same GPUs for all tasks. I can put, for example:

```
export CUDA_VISIBLE_DEVICES=0
```

as the first line of my startup_linux.sh file, and the webui will start up and load everything on GPU 0. But now I want to load alltalk_tts on a separate GPU. If I put, for example, `export CUDA_VISIBLE_DEVICES=1` in launch.sh, it still just uses GPU 0 to load and run the TTS model.

Does anyone know how I can use one set of GPUs for the main LLM and another for the TTS and STT models? I think I could run alltalk in a Docker container, but I didn't want to go down that road, as I finally got it running as an extension inside oobabooga.
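The likely reason the second export does nothing: CUDA_VISIBLE_DEVICES is read once per process when CUDA initializes, and alltalk_tts loads as an extension inside the same webui process, so it inherits whatever mask the webui started with. A generic (not oobabooga-specific) way to check what a given process was actually handed:

```python
import os
import torch

# The mask this process inherited from its environment (None if unset).
print(os.environ.get("CUDA_VISIBLE_DEVICES"))

# What torch actually sees after CUDA initialization; visible devices are
# always renumbered from 0, whatever the physical IDs in the mask were.
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
```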