Replies: 19 comments
>>> tbatkin [November 27, 2019, 3:03pm]
We are running into an issue when trying to run multiple inferences in
parallel on a GPU. Using torch multiprocessing, we have written a script
that creates a job queue and spawns 'n' worker processes.
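Roughly, the pattern we are using looks like this (a minimal sketch; `load_model` and `run_inference` are placeholder names for our actual model-loading and inference code, not the real DeepSpeech API):

```python
import torch.multiprocessing as mp

def worker(job_queue, result_queue):
    # Each worker process loads its own copy of the model.
    # load_model() / run_inference() are placeholders, not the DeepSpeech API.
    model = load_model()
    while True:
        wav_path = job_queue.get()
        if wav_path is None:  # sentinel value: no more work
            break
        result_queue.put((wav_path, run_inference(model, wav_path)))

if __name__ == "__main__":
    n = 4  # number of parallel worker processes
    job_queue, result_queue = mp.Queue(), mp.Queue()

    workers = [mp.Process(target=worker, args=(job_queue, result_queue))
               for _ in range(n)]
    for p in workers:
        p.start()

    for wav_path in ["sample1.wav", "sample2.wav", "sample3.wav"]:
        job_queue.put(wav_path)
    for _ in workers:
        job_queue.put(None)  # one sentinel per worker so each one exits

    for p in workers:
        p.join()
```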
When 'n' is greater than 2 we run into out-of-memory errors. From some
research on Discourse we have figured out that this is because
TensorFlow allocates all of the GPU memory to itself when it initialises
a session.
We know how to set the 'use_allow_growth' flag in flags.py, which as we
understand simply changes the tf.ConfigProto() passed to the session so
that
config.gpu_options.allow_growth = True
but that flag seems to apply only to training, not to inference.
How and where can we alter the tf.ConfigProto() used for inference so
that we can take full advantage of the GPU memory across multiple
processes?
(This is using v0.5.1 and the pre-trained model associated with it)
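For reference, the generic TensorFlow 1.x way to apply this setting is at session creation time, something like the following (a minimal sketch of the TF 1.x API, not DeepSpeech's actual inference code path):

```python
import tensorflow as tf

# Ask TensorFlow to grow GPU memory on demand instead of
# reserving the whole card when the session starts.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
# Alternatively, cap each process at a fixed fraction of the card, e.g. 30%:
# config.gpu_options.per_process_gpu_memory_fraction = 0.3

with tf.Session(config=config) as sess:
    # build/restore the inference graph and run it with this session
    pass
```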
Related threads:
- Multi-processing inference with multi-GPU setup
- Unable to build deepspeech binary for v0.6.1 from scratch
[This is an archived TTS discussion thread from discourse.mozilla.org/t/running-multiple-inferences-in-parallel-on-a-gpu]