When loading a `GradualWarmupScheduler` from a state dict to resume training, the `optimizer` attribute of the nested `after_scheduler` is restored from the `state_dict` as well. This leaves the learning rate static after resuming, because the `after_scheduler` then updates the learning rate of a stale optimizer rather than the one actually used by the resumed run. Setting `self.after_scheduler.optimizer = self.optimizer` as part of the `load_state_dict()` method should probably suffice to fix this.
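A minimal sketch of what that fix could look like, assuming the class is importable as `warmup_scheduler.GradualWarmupScheduler` and inherits the default `_LRScheduler.load_state_dict()` behavior (`self.__dict__.update(state_dict)`); the subclass name `FixedGradualWarmupScheduler` is only for illustration:

```python
from warmup_scheduler import GradualWarmupScheduler  # assumed import path


class FixedGradualWarmupScheduler(GradualWarmupScheduler):
    def load_state_dict(self, state_dict):
        # The default load_state_dict() does self.__dict__.update(state_dict),
        # which also restores the pickled after_scheduler together with its
        # stale optimizer reference from the saved run.
        super().load_state_dict(state_dict)
        # Re-attach the live optimizer so after_scheduler.step() keeps
        # updating the parameter groups of the resumed training run.
        if getattr(self, "after_scheduler", None) is not None:
            self.after_scheduler.optimizer = self.optimizer
```

The same assignment could of course live directly in `GradualWarmupScheduler.load_state_dict()` instead of a subclass; the subclass above is just a way to apply it without patching the package.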