Hi, thanks for your work.
I am testing the library, so I ran the training with pointnet.yaml on the S3DIS dataset (segmentation). Training went well for 100 epochs with batch_size=2 on an RTX 3080. However, when the testing part started, I got the following error:
[01/20 04:16:41 S3DIS]: Test [5]/[68] cloud
Test on 5-th cloud [20]/[72]]: 28%|████████████████████████████████████████████▍ | 20/72 [00:02<00:05, 9.00it/s]
Traceback (most recent call last):
File "examples/segmentation/main.py", line 745, in <module>
main(0, cfg)
File "examples/segmentation/main.py", line 308, in main
test_miou, test_macc, test_oa, test_ious, test_accs, _ = test(model, data_list, cfg)
File "/home/hri-david/anaconda3/envs/openpoints/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "examples/segmentation/main.py", line 598, in test
logits = model(data)
File "/home/hri-david/anaconda3/envs/openpoints/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hri-david/PycharmProjects/Pointnet/PointNeXt/examples/segmentation/../../openpoints/models/segmentation/base_seg.py", line 45, in forward
p, f = self.encoder.forward_seg_feat(data)
File "/home/hri-david/PycharmProjects/Pointnet/PointNeXt/examples/segmentation/../../openpoints/models/backbone/pointnet.py", line 170, in forward_seg_feat
trans = self.stn(x)
File "/home/hri-david/anaconda3/envs/openpoints/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hri-david/PycharmProjects/Pointnet/PointNeXt/examples/segmentation/../../openpoints/models/backbone/pointnet.py", line 36, in forward
x = F.relu(self.bn3(self.conv3(x)))
File "/home/hri-david/anaconda3/envs/openpoints/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/hri-david/anaconda3/envs/openpoints/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 179, in forward
self.eps,
File "/home/hri-david/anaconda3/envs/openpoints/lib/python3.7/site-packages/torch/nn/functional.py", line 2283, in batch_norm
input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 876.00 MiB (GPU 0; 9.74 GiB total capacity; 1.28 GiB already allocated; 121.19 MiB free; 3.16 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
wandb: | 2.905 MB of 2.905 MB uploaded
wandb: Run history:
wandb: best_val ▁▂▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▇██████████████
wandb: global_step ▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
wandb: lr ████████▇▇▇▇▇▆▆▆▆▅▅▅▄▄▄▄▃▃▃▃▂▂▂▂▂▁▁▁▁▁▁▁
wandb: macc_when_best ▁▂▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▇▇▇▇▇▇▇▇███████████████
wandb: oa_when_best ▁▁███████████████▆▆▇▇▇▇▇▇▇██████████████
wandb: train_loss █▅▅▄▄▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb: train_macc ▁▃▄▄▅▅▅▅▆▆▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇███████████████
wandb: train_miou ▁▃▄▄▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇███████████████
wandb: val_macc ▃▃▇▄▅▆▃▄▇▆▅▆▄▄▇▆▆▇▆▇▅▁▇▁▆▄█▇▇▆▇▃▇▇▆▆▇▇▅▄
wandb: val_miou ▄▃▇▄▆▆▃▄▇▆▆▅▃▄▇▅▅▇▅▆▄▂▇▁▆▃█▇▆▅▆▃▇▆▅▆▇▆▅▃
wandb: val_oa ▆▅█▄▇▇▅▆▇▇▇▆▅▅▇▆▆▇▆▆▅▂▇▁▇▃█▇▇▅▇▃▇▇▆▆▇▇▆▃
wandb:
wandb: Run summary:
wandb: best_val 22.63091
wandb: global_step 100
wandb: lr 1e-05
wandb: macc_when_best 29.38019
wandb: oa_when_best 61.35135
wandb: train_loss 1.55627
wandb: train_macc 42.63173
wandb: train_miou 34.23775
wandb: val_macc 20.69266
wandb: val_miou 12.51122
wandb: val_oa 41.35226
wandb:
wandb: 🚀 View run s3dis-train-pointnet-ngpus1-20240119-195032-Y9EAMrwTdiBMMf9hkLf8 at: https://wandb.ai/dsdiazc/PointNeXt-S3DIS/runs/5cx3w4ln
wandb: ️⚡ View job at https://wandb.ai/dsdiazc/PointNeXt-S3DIS/jobs/QXJ0aWZhY3RDb2xsZWN0aW9uOjEzMTk0MzY1NQ==/version_details/v0
wandb: Synced 6 W&B file(s), 0 media file(s), 2 artifact file(s) and 2 other file(s)
wandb: Find logs at: ./wandb/run-20240119_195033-5cx3w4ln/logs
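For reference, the numbers in the out-of-memory message can be read as follows: the GPU has 9.74 GiB in total, PyTorch has reserved 3.16 GiB, and only 121.19 MiB is free. The sketch below just redoes that arithmetic on the message text (the helper names and regular expressions are illustrative, not part of the library). It suggests that a large share of the card may be held by something other than this process's PyTorch allocator, for example another process on the same GPU, though the log alone cannot confirm what.

```python
import re

# Out-of-memory message copied from the traceback above (trailing hint omitted).
msg = (
    "CUDA out of memory. Tried to allocate 876.00 MiB (GPU 0; 9.74 GiB total capacity; "
    "1.28 GiB already allocated; 121.19 MiB free; 3.16 GiB reserved in total by PyTorch)"
)

def to_gib(value, unit):
    """Convert a value given in MiB or GiB to GiB."""
    return float(value) / 1024 if unit == "MiB" else float(value)

def field(pattern):
    """Extract one '<number> <unit>' field from the message, in GiB."""
    value, unit = re.search(pattern, msg).groups()
    return to_gib(value, unit)

total = field(r"([\d.]+) (GiB|MiB) total capacity")
free = field(r"([\d.]+) (GiB|MiB) free")
reserved = field(r"([\d.]+) (GiB|MiB) reserved")
requested = field(r"allocate ([\d.]+) (GiB|MiB)")

# Memory that is neither free nor reserved by this process's PyTorch allocator.
outside = total - reserved - free

print(f"requested: {requested:.2f} GiB, free: {free:.2f} GiB")
print(f"neither free nor reserved by PyTorch: {outside:.2f} GiB")
```

If that reading is right, checking nvidia-smi for other processes while the test runs would be a reasonable first step before changing the config.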
Do I need to change anything else in the yaml file to make testing work on my hardware (RTX 3080)?
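Separately from the yaml file, the error message itself points at max_split_size_mb and PYTORCH_CUDA_ALLOC_CONF. A minimal sketch of how that setting could be applied from Python is shown here, assuming it is set before torch is imported. The value 128 is an arbitrary illustrative choice, and it is not confirmed that this resolves the error in this case.

```python
import os

# Assumption: set this before `import torch` so it is in place before
# the CUDA caching allocator is initialized.
# 128 is an arbitrary illustrative value, not a recommendation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

The same variable can also be exported in the shell before launching examples/segmentation/main.py.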
Thank you in advance!