WARNING:tensorboardX.x2num:NaN or Inf found in input tensor. #368

Open
liuqiang1227 opened this issue Jun 18, 2024 · 1 comment

Comments

@liuqiang1227

During training, the following warning is issued:

$ python -m torch.distributed.launch --nproc_per_node=1 train.py --data /home/a/CustomMap --object yepian

Loading Model...
ready to train!
WARNING:tensorboardX.x2num:NaN or Inf found in input tensor.
Train Epoch: 1 [0/5000 (0%)] Loss: 0.035234406590462 Local Rank: 0
Train Epoch: 1 [1600/5000 (32%)] Loss: 0.004101661965251 Local Rank: 0
Train Epoch: 1 [3200/5000 (64%)] Loss: 0.002820491790771 Local Rank: 0
Train Epoch: 1 [4800/5000 (96%)] Loss: 0.002345604822040 Local Rank: 0
WARNING:tensorboardX.x2num:NaN or Inf found in input tensor.
Train Epoch: 2 [0/5000 (0%)] Loss: 0.003404101822525 Local Rank: 0
Train Epoch: 2 [1600/5000 (32%)] Loss: 0.003067462937906 Local Rank: 0
Train Epoch: 2 [3200/5000 (64%)] Loss: 0.003346179146320 Local Rank: 0
Train Epoch: 2 [4800/5000 (96%)] Loss: 0.003464318113402 Local Rank: 0

@TontonTremblay
Collaborator

We have had this warning forever, and I was never able to figure out why. You can ignore it.
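If someone wants to track down which logged value triggers the warning, one possible approach is to wrap the summary writer and record every non-finite scalar before it is passed on. This is only a sketch and not code from this repository: `FiniteCheckingWriter`, `_RecordingWriter`, and the tags `loss/train` and `loss/test` are hypothetical names, and it assumes the training script logs through `add_scalar(tag, value, step)`.

```python
import math
from typing import Any, List, Tuple


class FiniteCheckingWriter:
    """Hypothetical wrapper around any object exposing add_scalar(tag, value, step).

    It records every NaN/Inf scalar so the offending tag can be identified.
    """

    def __init__(self, writer: Any) -> None:
        self._writer = writer
        self.non_finite: List[Tuple[str, float, int]] = []

    def add_scalar(self, tag: str, value: Any, step: int) -> None:
        # float() also accepts 0-dim tensors and NumPy scalars.
        as_float = float(value)
        if not math.isfinite(as_float):
            self.non_finite.append((tag, as_float, step))
            print(f"non-finite scalar: tag={tag!r} value={as_float} step={step}")
        # Forward the call unchanged so logging behaves as before.
        self._writer.add_scalar(tag, value, step)


class _RecordingWriter:
    """Stand-in for a real summary writer, used only for this example."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Any, int]] = []

    def add_scalar(self, tag: str, value: Any, step: int) -> None:
        self.calls.append((tag, value, step))


# Hypothetical tags; the real script may use different ones.
writer = FiniteCheckingWriter(_RecordingWriter())
writer.add_scalar("loss/train", 0.0352, 0)
writer.add_scalar("loss/test", float("nan"), 0)
```

In a real run, the stand-in would be replaced by the writer the training script already creates. The tags collected in `writer.non_finite` would then show which quantity is NaN or Inf at the points where the warning appears.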
