Potential Bug in CPN? #3

Closed
ppriyank opened this issue Oct 28, 2021 · 2 comments

https://github.com/FZJ-INM1-BDA/celldetection/blob/main/demos/Cell%20Detection%20with%20Contour%20Proposal%20Networks.ipynb

In the tutorial above, I set cpn='CpnU22' in conf = cd.Config(....).
The first epoch of training completes, but during the second epoch I get the following error:

Epoch 2/100 - loss 12.061:  56%|███████████████████████████████████████████████████▍                                        | 286/512 [02:23<01:54,  1.98it/s]
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [55,0,0], thread: [64,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
...
...
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [85,0,0], thread: [94,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [85,0,0], thread: [95,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
...
...
...

Traceback (most recent call last):
  File "train2.py", line 231, in <module>
    train_epoch(model, train_loader, conf.device, optimizer, f'Epoch {epoch}/{conf.epochs}', scaler, scheduler)
  File "train2.py", line 212, in train_epoch
    outputs: dict = model(batch['inputs'], targets=batch)
  File "/home/ppriyank/anaconda3/envs/pathak/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ppriyank/covid_cell/celldetection/celldetection/models/cpn.py", line 441, in forward
    buckets = resolve_refinement_buckets(sampling, self.core.refinement_buckets)
  File "/home/ppriyank/covid_cell/celldetection/celldetection/ops/cpn.py", line 203, in resolve_refinement_buckets
    (a % num_buckets, refinement_bucket_weight(a, base_index)),
  File "/home/ppriyank/covid_cell/celldetection/celldetection/ops/cpn.py", line 193, in refinement_bucket_weight
    dist[sel] = 0
RuntimeError: CUDA error: device-side assert triggered

ericup (Collaborator) commented Oct 28, 2021

Thank you for your feedback!
I suspect this is a duplicate of #1.
Would you please confirm this by testing the suggested workaround?

ericup self-assigned this Oct 28, 2021

ppriyank (Author) commented

Oh yes, this is the same issue. Apologies.
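
For readers who hit the same assertion: CUDA kernels run asynchronously, so the Python line named in the traceback (`dist[sel] = 0`) may not be the operation that actually failed. Setting the `CUDA_LAUNCH_BLOCKING=1` environment variable before the first CUDA call makes kernel launches synchronous, which usually gives a more accurate traceback. The sketch below also mirrors the bounds condition from the assertion message in plain Python, so indices can be checked on the host before they are used. The helper name `out_of_bounds` and the sample values are hypothetical and are not part of celldetection or PyTorch.

```python
import os

# Must be set before CUDA is initialised; makes kernel launches synchronous
# so the reported Python line matches the failing operation more closely.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def out_of_bounds(indices, size):
    """Return (position, index) pairs violating -size <= index < size.

    This is the same condition as in the assertion message:
    index >= -sizes[i] && index < sizes[i]
    """
    return [(pos, idx) for pos, idx in enumerate(indices)
            if not (-size <= idx < size)]


# Hypothetical indices checked against a dimension of size 6.
bad = out_of_bounds([0, 5, -6, 6, -7], size=6)
print(bad)  # [(3, 6), (4, -7)]
```

With tensors, the same check could be run on values copied to the CPU (for example via `.tolist()`) just before the indexing step that fails.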
