add cosine restart learning rate #2953

Open

wants to merge 3 commits into base: devel
Conversation

hellozhaoming

No description provided.

@codecov

codecov bot commented Oct 27, 2023

Codecov Report

Attention: 39 lines in your changes are missing coverage. Please review.

Comparison: base (2fe6927) 75.36% vs. head (05052c1) 75.07%.
Report is 16 commits behind head on devel.

Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #2953      +/-   ##
==========================================
- Coverage   75.36%   75.07%   -0.30%     
==========================================
  Files         245      220      -25     
  Lines       24648    20297    -4351     
  Branches     1582      903     -679     
==========================================
- Hits        18577    15238    -3339     
+ Misses       5140     4526     -614     
+ Partials      931      533     -398     
Files Coverage Δ
deepmd/common.py 83.65% <ø> (ø)
deepmd/utils/argcheck.py 96.16% <100.00%> (+0.08%) ⬆️
deepmd/train/trainer.py 84.56% <54.54%> (-0.50%) ⬇️
deepmd/utils/learning_rate.py 48.57% <22.72%> (-43.74%) ⬇️

... and 62 files with indirect coverage changes


"""Get the start lr."""
return self.start_lr_

def value(self, step: int) -> float:
Collaborator

You may not need to implement the value method if you do not print the learning-rate information at the beginning of the training:
https://github.com/hellozhaoming/deepmd-kit/blob/05052c195308f61b63ce2bab130ce0e8cba60604/deepmd/train/trainer.py#L566
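For context, a minimal sketch of what such a value method could compute for a plain cosine schedule; stop_lr_ and decay_steps_ are hypothetical attribute names here, only start_lr_ appears in the diff above:

```python
import numpy as np


# Sketch only, not the PR's actual code.
def value(self, step: int) -> float:
    """Return the learning rate at a given training step (NumPy mirror of the TF schedule)."""
    alpha = self.stop_lr_ / self.start_lr_            # floor as a fraction of the start lr
    step = min(step, self.decay_steps_)               # clamp once the schedule has finished
    cosine_decay = 0.5 * (1.0 + np.cos(np.pi * step / self.decay_steps_))
    decayed = (1.0 - alpha) * cosine_decay + alpha
    return self.start_lr_ * decayed
```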

@wanghan-iapcm changed the base branch from master to devel on October 27, 2023, 13:00
@njzjz (Member) left a comment


Please run pre-commit to format and lint the code: https://docs.deepmodeling.com/projects/deepmd/en/master/development/coding-conventions.html#run-scripts-to-check-the-code. Or you can submit from a non-protected branch and let pre-commit.ci do it for you.

Unit tests should be added for the two new learning rate classes.
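As a starting point, a hedged sketch of what such a test could look like; LearningRateCos, its constructor arguments, and the value method it exercises are assumptions for illustration rather than the PR's actual API:

```python
import unittest

import numpy as np

# Hypothetical import: the actual class name added by this PR may differ.
from deepmd.utils.learning_rate import LearningRateCos


class TestCosineLearningRate(unittest.TestCase):
    def test_value_matches_formula(self):
        start_lr, stop_lr, decay_steps = 1.0e-3, 1.0e-5, 1000
        lr = LearningRateCos(start_lr=start_lr, stop_lr=stop_lr, decay_steps=decay_steps)
        alpha = stop_lr / start_lr
        for step in (0, 100, 500, 999, 1000, 2000):
            # Reference value computed directly from the cosine-decay formula.
            clamped = min(step, decay_steps)
            cosine = 0.5 * (1.0 + np.cos(np.pi * clamped / decay_steps))
            expected = start_lr * ((1.0 - alpha) * cosine + alpha)
            self.assertAlmostEqual(lr.value(step), expected, places=12)


if __name__ == "__main__":
    unittest.main()
```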

@@ -125,6 +125,7 @@ def gelu_wrapper(x):
"softplus": tf.nn.softplus,
"sigmoid": tf.sigmoid,
"tanh": tf.nn.tanh,
"swish": tf.nn.swish,
Member


It seems that swish has been renamed to silu: tensorflow/tensorflow#41066
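If older TensorFlow versions still need to be supported, one possible (hedged) way to handle the rename is to resolve the attribute at lookup time; the dictionary name follows the diff above:

```python
import tensorflow as tf

# tf.nn.swish was renamed to tf.nn.silu (tensorflow/tensorflow#41066); prefer the new
# name and fall back to the old one on TensorFlow versions that only ship swish.
ACTIVATION_FN_DICT = {
    "tanh": tf.nn.tanh,
    "swish": getattr(tf.nn, "silu", None) or getattr(tf.nn, "swish", None),
}
```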

)
else:
for fitting_key in self.fitting:
if self.lr_type == "exp":
Member


It's not good practice to switch on the learning-rate type in the Trainer. Instead, implement the method LearningRate.log_start (LearningRate should be an abstract base class inherited by all learning-rate classes) and call self.lr.log_start(self.sess) here.
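A rough sketch of the suggested structure, assuming the build/start_lr method names used by the existing exponential schedule; the logging details are placeholders:

```python
import logging
from abc import ABC, abstractmethod

log = logging.getLogger(__name__)


class LearningRate(ABC):
    """Base class that every learning-rate schedule inherits from (sketch only)."""

    @abstractmethod
    def build(self, global_step, stop_step=None):
        """Return the TF tensor holding the learning rate at global_step."""

    @abstractmethod
    def start_lr(self) -> float:
        """Return the learning rate at step 0."""

    def log_start(self, sess):
        """Log schedule information at the start of training.

        The default only prints the starting learning rate; subclasses can
        override it (and use sess to evaluate their lr tensor), so the Trainer
        simply calls self.lr.log_start(self.sess) without switching on lr_type.
        """
        log.info("start training at lr %.2e", self.start_lr())
```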

[Argument("exp", dict, learning_rate_exp())],
[Argument("exp", dict, learning_rate_exp()),
Argument("cos", dict, learning_rate_cos()),
Argument("cosrestart", dict, learning_rate_cosrestarts())],
Member


You may need to add some documentation to variants (doc="xxx"). Otherwise, no one knows what they are.
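For instance, something along these lines; the doc strings are placeholders and the surrounding Variant call is shown only for context, so the real argcheck.py layout may differ:

```python
from dargs import Argument, Variant

# Placeholder doc strings; adjust the wording to whatever the schedules actually do.
doc_lr_exp = "Exponentially decaying learning rate."
doc_lr_cos = "Cosine-decay learning rate, decaying from start_lr to a final value."
doc_lr_cosrestart = "Cosine-decay learning rate with periodic warm restarts."

variant = Variant(
    "type",
    [
        Argument("exp", dict, learning_rate_exp(), doc=doc_lr_exp),
        Argument("cos", dict, learning_rate_cos(), doc=doc_lr_cos),
        Argument("cosrestart", dict, learning_rate_cosrestarts(), doc=doc_lr_cosrestart),
    ],
    optional=True,
    default_tag="exp",
)
```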

Comment on lines +113 to +118
```python
global_step = min(global_step, decay_steps)
cosine_decay = 0.5 * (1 + cos(pi * global_step / decay_steps))
decayed = (1 - alpha) * cosine_decay + alpha
decayed_learning_rate = learning_rate * decayed
```
Member



The function returns the cosine decayed learning rate while taking into account
possible warm restarts.
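For reference, TensorFlow ships a warm-restart schedule that can be driven directly from the global step; a hedged sketch of how a cosrestart build method might wrap it (the attribute names and the t_mul/m_mul defaults are assumptions, not this PR's actual code):

```python
import tensorflow.compat.v1 as tf


def build(self, global_step, stop_step=None):
    # Sketch only: first_decay_steps_, start_lr_ and stop_lr_ are assumed attributes.
    return tf.train.cosine_decay_restarts(
        learning_rate=self.start_lr_,
        global_step=global_step,
        first_decay_steps=self.first_decay_steps_,
        t_mul=2.0,   # each restart period lasts twice as long as the previous one
        m_mul=1.0,   # restart from the full initial learning rate
        alpha=self.stop_lr_ / self.start_lr_,  # floor, as a fraction of start_lr
        name="cosine_restart_learning_rate",
    )
```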
(quoted diff line: a stray code-fence marker left at the end of the docstring)

Member

This line should be removed.
