This repository has been archived by the owner on Feb 9, 2023. It is now read-only.
Alala... we have to fix this, because the multi_gpu_model function was deprecated on 1 April; more details here.
As you can see in this link, we have to use MirroredStrategy in our case. If you want to help, please open a pull request. Otherwise, the hotfix is to create the model within the context of the strategy. Please read the detailed tutorial directly on the TensorFlow page here.
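A minimal sketch of that hotfix, assuming a toy model (the layer sizes and optimizer here are placeholders, not this repo's actual architecture): build and compile inside the strategy scope so variables are mirrored across all visible GPUs.

```python
import tensorflow as tf

# MirroredStrategy falls back to a single replica (CPU) when no GPUs are visible.
strategy = tf.distribute.MirroredStrategy()
print("Number of replicas:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Placeholder model: the real ASR model would be created here instead.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(100,)),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
```

With this pattern, `model.fit` automatically splits each global batch across the replicas.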
Please tell me if you can fix it.
rolczynski changed the title from "How can i use multi gpus" to "TensorFlow multi_gpu_model function is deprecated" on Apr 8, 2020.
Hi @rolczynski
It seems that the target_tensors option used when compiling the model is not supported under tf MirroredStrategy.
Do you have any idea how the target_tensors option could be avoided, so that the model is mapped to the right target automatically?
In my experiments, simply removing the target_tensors option throws a CTC-loss-related error when compiling the model. This is the snippet I am referring to:
```python
def compile_model(self):
    """ The compiled model means the model configured for training. """
    y = keras.layers.Input(name='y', shape=[None], dtype='int32')
    loss = self.get_loss()
    self._model.compile(self._optimizer, loss, target_tensors=[y])
    logger.info("Model is successfully compiled")
```
It's quite a big change. As we can see, things tend to get more and more complicated when we try to stick with the functional API. We want to build models effortlessly, so I think we should do it by subclassing the tf.keras.Model class.
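A minimal sketch of the subclassing approach (the layer choices and sizes here are hypothetical, not the repo's actual DeepSpeech-style architecture):

```python
import tensorflow as tf

class SpeechModel(tf.keras.Model):
    """Toy subclassed acoustic model: features in, per-frame logits out."""

    def __init__(self, vocab_size=29, units=128):
        super().__init__()
        self.rnn = tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(units, return_sequences=True))
        self.logits = tf.keras.layers.Dense(vocab_size)

    def call(self, inputs):
        # inputs: (batch, time, feature_dim) -> (batch, time, vocab_size)
        return self.logits(self.rnn(inputs))
```

Subclassed models define the forward pass imperatively in `call`, which sidesteps the compile-time target wiring that target_tensors required.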
Hello, I want to train on my dataset, and I have two GPUs.
Below is my code:
```python
pipeline = asr.pipeline.CTCPipeline(
    alphabet, features_extractor, model, optimizer, decoder, gpus=['gpu:0', 'gpu:1']
)
dataset = pipeline.wrap_preprocess(dataset, False, None)
dev_dataset = pipeline.wrap_preprocess(dev_dataset, False, None)
y = tf.keras.layers.Input(name='y', shape=[None], dtype='int32')
loss = pipeline.get_loss()
pipeline._model.compile(pipeline._optimizer, loss, target_tensors=[y])
pipeline._model.fit(dataset, validation_data=dev_dataset, epochs=100)
pipeline._model.save(os.path.join('/checkpoint', 'model.h5'))
```
However, the model uses only one GPU:
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44       Driver Version: 440.44       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp            Off  | 00000000:67:00.0 Off |                  N/A |
| 48%   78C    P2   220W / 250W | 11861MiB / 12196MiB  |     86%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN Xp            Off  | 00000000:68:00.0  On |                  N/A |
| 27%   45C    P8    12W / 250W |   574MiB / 12194MiB  |      0%      Default |
+-------------------------------+----------------------+----------------------+
```
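As a quick sanity check (not a fix), it can help to confirm which devices TensorFlow actually sees; without a distribution strategy, a compiled Keras model will run on only one device even when several GPUs are visible.

```python
import tensorflow as tf

# List the GPUs TensorFlow can see. An empty list on a two-GPU machine
# would point to a driver/CUDA visibility problem rather than a Keras one.
gpus = tf.config.list_physical_devices("GPU")
print(gpus)
```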
It also seems that an OOM error occurs when the batch size is increased.