[Bug] CheckpointHook: running val alone raises an error when save_best is set #1587

Open
2 tasks done
BayMaxBHL opened this issue Oct 20, 2024 · 0 comments
Labels
bug Something isn't working

BayMaxBHL (Contributor) commented Oct 20, 2024

Prerequisite

Environment

All environments

Reproduces the problem - code sample

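# Config fragment: CheckpointHook tracking two "best" metrics.
default_hooks = dict(
    checkpoint=dict(
        type="CheckpointHook",
        by_epoch=False,
        interval=2000,
        max_keep_ckpts=1,
        save_best=["DepthMetric/abs_rel", "DepthMetric/rmse"],
        rule=["less", "less"],
    ),
)

# Run validation only, without calling runner.train() first.
runner.val()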
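
A possible workaround for validation-only debugging runs is to drop the best-checkpoint settings from the hook config before building the runner. The sketch below uses plain dict operations only. `without_save_best` is a hypothetical helper, and it assumes that `save_best` and `rule` are optional arguments of `CheckpointHook` and that leaving them out disables best-checkpoint tracking.

```python
import copy


def without_save_best(hook_cfg):
    """Return a copy of a hook config with best-checkpoint tracking removed.

    Hypothetical helper for illustration; not part of mmengine.
    """
    cfg = copy.deepcopy(hook_cfg)
    # Assumption: both keys are optional, so removing them is safe.
    cfg.pop("save_best", None)
    cfg.pop("rule", None)
    return cfg


checkpoint = dict(
    type="CheckpointHook",
    by_epoch=False,
    interval=2000,
    max_keep_ckpts=1,
    save_best=["DepthMetric/abs_rel", "DepthMetric/rmse"],
    rule=["less", "less"],
)

# Config to use when only runner.val() will be called.
val_only_checkpoint = without_save_best(checkpoint)
```

The original `checkpoint` dict is left untouched, so the same config file can still be used for training.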

Reproduces the problem - command or script

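When I run validation on its own (for debugging), without training first, the run fails at the end of the validation epoch: CheckpointHook reports that there is no save_best history for the configured metrics.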
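
The failure pattern can be illustrated with a toy model. This is a simplified sketch, not mmengine's code: `ToyCheckpointHook` is a hypothetical class, and it assumes the per-metric history is only filled in when training starts, which would explain why a validation-only run finds nothing to look up.

```python
class ToyCheckpointHook:
    """Toy stand-in for a hook that tracks the best checkpoint per metric.

    Hypothetical class for illustration only; it is not mmengine's code.
    """

    def __init__(self, save_best):
        self.key_indicators = list(save_best)
        # Starts empty; entries are only created in before_train().
        self.best_ckpt_path_dict = {}

    def before_train(self):
        # Assumed behavior: the history is initialized when training starts.
        for key in self.key_indicators:
            self.best_ckpt_path_dict[key] = None

    def after_val_epoch(self, metrics):
        for key in self.key_indicators:
            if key not in metrics:
                continue
            # Unguarded lookup, like the last frame of the reported traceback.
            previous = self.best_ckpt_path_dict[key]
            if previous is None:
                self.best_ckpt_path_dict[key] = f"best_{key.replace('/', '_')}.pth"


metrics = {"DepthMetric/abs_rel": 0.7433, "DepthMetric/rmse": 0.4196}
keys = ["DepthMetric/abs_rel", "DepthMetric/rmse"]

# Validation only: before_train() never ran, so the lookup raises KeyError.
val_only = ToyCheckpointHook(keys)
try:
    val_only.after_val_epoch(metrics)
    val_only_error = None
except KeyError as exc:
    val_only_error = exc.args[0]

# Training path: before_train() seeds the dict, so validation succeeds.
trained = ToyCheckpointHook(keys)
trained.before_train()
trained.after_val_epoch(metrics)
```

In the toy model the only difference between the two runs is whether `before_train()` was called before `after_val_epoch()`.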

Reproduces the problem - error message

10/20 12:19:20 - mmengine - INFO - Epoch(val) [0][124/125] eta: 0:00:00 time: 0.0564 data_time: 0.0008 memory: 482
10/20 12:19:20 - mmengine - INFO - Epoch(val) [0][125/125] eta: 0:00:00 time: 0.0561 data_time: 0.0007 memory: 482
10/20 12:19:20 - mmengine - INFO - Epoch(val) [0][125/125] DepthMetric/abs_rel: 0.7433 DepthMetric/sq_rel: 0.3333 DepthMetric/rmse: 0.4196 DepthMetric/rmse_log: 1.3231 DepthMetric/a1: 0.0834 DepthMetric/a2: 0.1743 DepthMetric/a3: 0.2699 data_time: 0.0170 time: 0.0793
Traceback (most recent call last):
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/runpy.py", line 198, in _run_module_as_main
    return _run_code(code, main_globals, None,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/runpy.py", line 88, in _run_code
    exec(code, run_globals)
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 71, in <module>
    cli.main()
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 501, in main
    run()
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 351, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 310, in run_path
    return _run_module_code(code, init_globals, run_name, pkg_name=pkg_name, script_name=fname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 127, in _run_module_code
    _run_code(code, mod_globals, init_globals, mod_name, mod_spec, pkg_name, script_name)
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 118, in _run_code
    exec(code, run_globals)
  File "/home/baihanlin/Project/UMDE/DreamDE/run_command.py", line 56, in <module>
    demo()
  File "/home/baihanlin/Project/UMDE/DreamDE/run_command.py", line 50, in demo
    coal_dump_mde()
  File "/home/baihanlin/Project/UMDE/DreamDE/run_command.py", line 35, in coal_dump_mde
    run_command_script(
  File "/home/baihanlin/Project/UMDE/DreamDE/tools/run_task.py", line 41, in run_command_script
    runner.val()
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/runner/runner.py", line 1800, in val
    metrics = self.val_loop.run()  # type: ignore
              ^^^^^^^^^^^^^^^^^^^
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/runner/loops.py", line 377, in run
    self.runner.call_hook('after_val_epoch', metrics=metrics)
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/runner/runner.py", line 1839, in call_hook
    getattr(hook, fn_name)(self, **kwargs)
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/hooks/checkpoint_hook.py", line 361, in after_val_epoch
    self._save_best_checkpoint(runner, metrics)
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/hooks/checkpoint_hook.py", line 514, in _save_best_checkpoint
    best_ckpt_path = self.best_ckpt_path_dict[key_indicator]
                     ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: 'DepthMetric/abs_rel'
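
The last frame indexes `best_ckpt_path_dict` directly with `key_indicator`. One possible defensive change, sketched here as an assumption and not as a confirmed fix from the maintainers, is to treat a missing key as "no best checkpoint recorded yet". `lookup_best_ckpt` is a hypothetical standalone function; only the two names taken from the traceback come from the source.

```python
def lookup_best_ckpt(best_ckpt_path_dict, key_indicator):
    """Return the recorded best-checkpoint path, or None when no history exists."""
    # dict.get() returns None for a missing key instead of raising KeyError.
    return best_ckpt_path_dict.get(key_indicator)


# Validation-only run: nothing has been recorded yet.
empty_history = {}
first_lookup = lookup_best_ckpt(empty_history, "DepthMetric/abs_rel")

# After a best checkpoint has been recorded, the stored path is returned.
history = {"DepthMetric/abs_rel": "best_abs_rel.pth"}  # hypothetical file name
second_lookup = lookup_best_ckpt(history, "DepthMetric/abs_rel")
```

With this kind of guard, a validation-only run would see `None` for every metric and could proceed as if no best checkpoint had been saved before.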

Additional information

No response

BayMaxBHL added the bug label Oct 20, 2024