Long training time for detection #7619
Unanswered
Thibescobar
asked this question in
Q&A
-
Hi @Can-Zhao, would you mind sharing your insights on this matter?
-
I found out that a colleague had the same problem with nnDetection and fixed it by inputting zarr files instead of nii.gz ones. Does that sound adaptable to the bundle usage of MONAI detection? If yes, could you give me a hint, please? Have a nice day.
-
Hello, here are details of the long training time problem I face related to the previous post here: Project-MONAI/model-zoo#577
I am using the model zoo's lung nodule ct detection bundle to train the other folds (trained fold 0 model already given by the zoo): https://monai.io/model-zoo.html
Facing this long training time, I wanted to make sure it does not come from the bundle usage, so I followed the Python script approach given by the lung nodule detection tutorial here: https://github.com/Project-MONAI/tutorials/tree/main/detection
Unfortunately, the training time is the same, so I investigated the code and added time markers using print().
In the following piece of code, when AMP is used, execution is slow at the lines after the time markers "5.1 (amp)" and "5.4 (amp)" (~10 s for the whole iteration). Without AMP (amp = False), it stalls at "5.1 (no amp)" and "6", and the whole iteration takes even longer (~120 s). Could you please help me find out why and how to fix it if possible, or give me some hints?
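A note on the timing method: CUDA kernels are launched asynchronously, so plain print() markers can attribute GPU time to whichever later line happens to force a synchronization. The sketch below is one way to time a block more reliably. The helper name `timed` and the `timings` dictionary are hypothetical, not part of MONAI or the tutorial; on a GPU run one would pass `sync=torch.cuda.synchronize` so the clock is read only after pending GPU work has finished.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results, sync=None):
    """Record the wall-clock duration of a block in results[label].

    `sync` is an optional callable invoked before each clock read.
    On a GPU run it would typically be torch.cuda.synchronize
    (an assumption here; this sketch itself needs no GPU).
    """
    if sync is not None:
        sync()  # flush work queued before the block starts
    start = time.perf_counter()
    try:
        yield
    finally:
        if sync is not None:
            sync()  # wait for work launched inside the block
        results[label] = time.perf_counter() - start


# Usage: wrap the step that follows a time marker.
timings = {}
with timed("5.1 (amp)", timings):
    sum(range(100_000))  # stand-in for the forward pass
for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f} s")
```

With synchronization on both sides of the block, the measured time belongs to the wrapped step rather than to the next line that happens to block on the GPU.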
My configuration is:
Computer: Laptop Dell Precision 7670
OS: Windows 10 Professional (22H2)
System type: x64
GPU: NVIDIA RTX A3000 12GB
CPU: 12th Gen Intel(R) Core(TM) i7-12850HX 2.10 GHz
RAM: 32GB
Python version: 3.10.14
MONAI version: 1.3.0
MONAI Weekly version: 1.4.dev2414
Pytorch version: torch 2.2.2+cu118
cuDNN version:
- 8.7, given by conda activate monailuna && python >>> import torch >>> torch.backends.cudnn.version() (I think this is the one in use)
- 8.6, installed outside the active conda env at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDNN
CUDA version:
- 11.8, given by conda activate monailuna && python >>> import torch >>> torch.version.cuda (I think this is the one in use)
- 11.7, given by nvcc --version (version installed outside the active conda env)
- 12.2, given by nvidia-smi (compatible version but not the one installed?)
Thank you very much in advance.
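For reference, the versions that the training process itself sees can be gathered in one place, as sketched below. The function name `collect_versions` is hypothetical. As far as I understand, prebuilt PyTorch wheels ship their own CUDA runtime and cuDNN, so the values reported by torch are the ones used at training time, while nvcc reports the separately installed toolkit and nvidia-smi reports the highest CUDA version the driver supports.

```python
import importlib.util
import platform


def collect_versions():
    """Return a dict of the versions visible from the current environment."""
    info = {
        "python": platform.python_version(),
        "os": platform.platform(),
    }
    # Query torch only if it is installed in the active environment.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        info["torch_cuda"] = torch.version.cuda  # CUDA runtime torch was built with
        info["cudnn"] = torch.backends.cudnn.version()  # integer build number, or None
        info["cuda_available"] = torch.cuda.is_available()
        if info["cuda_available"]:
            info["gpu"] = torch.cuda.get_device_name(0)
    # Same for MONAI.
    if importlib.util.find_spec("monai") is not None:
        import monai

        info["monai"] = monai.__version__
    return info


for key, value in collect_versions().items():
    print(f"{key}: {value}")
```

Run inside the activated conda env (monailuna here), this lists everything in a single report that can be pasted into the question.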