🐛[BUG]: GraphCast example fails with unexpected keyword argument 'mesh_level' #670

Open
benkirk opened this issue Sep 12, 2024 · 2 comments
Labels: ? - Needs Triage (Need team to review and classify), bug (Something isn't working)

Comments

benkirk commented Sep 12, 2024

Version

0.7.0

On which installation method(s) does this occur?

Docker

Describe the issue

I'm attempting to run the examples/weather/graphcast example with nvidia-modulus 0.7.0 under the NGC 24.07 image, and receive an unexpected keyword argument error:

Apptainer> python train_graphcast.py wb_mode=disabled synthetic_dataset=true
/usr/local/lib/python3.10/dist-packages/modulus/distributed/manager.py:346: UserWarning: Could not initialize using ENV, SLURM or OPENMPI methods. Assuming this is a single process job
  warn(
[08:38:48 - main - INFO] Rank: 0, Device: cuda:0
[08:38:48 - main - WARNING] Using synthetic dataset. Ignoring static dataset, cosine zenith angle, time of the year, and history. Also setting num_workers to 0.
[08:38:48 - main - INFO] Using torch.bfloat16 dtype
[08:38:48 - main - WARNING] Static dataset path is not provided. Setting num_channels_static to 0.
Error executing job with overrides: ['wb_mode=disabled', 'synthetic_dataset=true']
Traceback (most recent call last):
  File "/glade/work/benkirk/repos/csg-utils/hpc-demos/containers/AI_ML/NGC/apptainer/modulus/modulus/examples/weather/graphcast/train_graphcast.py", line 363, in main
    trainer = GraphCastTrainer(cfg, dist, rank_zero_logger)
  File "/glade/work/benkirk/repos/csg-utils/hpc-demos/containers/AI_ML/NGC/apptainer/modulus/modulus/examples/weather/graphcast/train_graphcast.py", line 105, in __init__
    self.model = GraphCastNet(
  File "/usr/local/lib/python3.10/dist-packages/modulus/models/module.py", line 65, in __new__
    bound_args = sig.bind_partial(
  File "/usr/lib/python3.10/inspect.py", line 3193, in bind_partial
    return self._bind(args, kwargs, partial=True)
  File "/usr/lib/python3.10/inspect.py", line 3175, in _bind
    raise TypeError(
TypeError: got an unexpected keyword argument 'mesh_level'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

(The same issue occurs via pip install.)
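
For anyone reading the traceback above: the TypeError is raised by `inspect.Signature.bind_partial`, which rejects any keyword that the target constructor does not declare. That would be consistent with the example script passing `mesh_level` to a model class whose installed signature does not have it, though this is an inference from the traceback and not something confirmed here. A minimal sketch of the mechanism follows; `ToyNet`, its parameters, and `unexpected_kwargs` are hypothetical names used only for illustration and are not part of Modulus.

```python
import inspect


class ToyNet:
    """Hypothetical stand-in for a model whose constructor has no `mesh_level`."""

    def __init__(self, hidden_dim=512, num_layers=4):
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers


def unexpected_kwargs(cls, **kwargs):
    """Return the keyword names that cls.__init__ does not declare."""
    params = inspect.signature(cls.__init__).parameters
    # A constructor that takes **kwargs accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in kwargs if name not in params]


# Reproduce the failure mode: bind arguments against the constructor
# signature, passing a placeholder for `self`.
sig = inspect.signature(ToyNet.__init__)
try:
    sig.bind_partial(None, mesh_level=6)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(unexpected_kwargs(ToyNet, mesh_level=6, hidden_dim=128))
```

Running the same kind of check against the installed model class (for example, printing `inspect.signature(...)` for it) is one way to see whether the example script and the installed package agree on the constructor arguments.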

Any help would be appreciated!

Minimum reproducible example

python train_graphcast.py wb_mode=disabled synthetic_dataset=true

Relevant log output

No response

Environment details

No response

benkirk added the ? - Needs Triage and bug labels Sep 12, 2024
Author

benkirk commented Sep 12, 2024

I can verify the previous commit bccede0 at least starts running.

Collaborator

mnabian commented Oct 17, 2024

Hi @benkirk, thanks for reporting the issue. As you rightly mentioned, this has been fixed in the latest GraphCast commits.
Can we close this issue?

mnabian self-assigned this Oct 17, 2024