add cosine restart learning rate #2953
base: devel
Conversation
Signed-off-by: hellozhaoming <[email protected]>
Add cosine restart learning rate
Codecov Report

Attention:

Additional details and impacted files:

```
@@            Coverage Diff             @@
##            devel    #2953      +/-   ##
==========================================
- Coverage   75.36%   75.07%    -0.30%
==========================================
  Files         245      220       -25
  Lines       24648    20297     -4351
  Branches     1582      903      -679
==========================================
- Hits        18577    15238     -3339
+ Misses       5140     4526      -614
+ Partials      931      533      -398
```

☔ View full report in Codecov by Sentry.
"""Get the start lr.""" | ||
return self.start_lr_ | ||
|
||
def value(self, step: int) -> float: |
You may not need to implement the `value` method if you do not print the information about the learning rate at the beginning of the training:
https://github.com/hellozhaoming/deepmd-kit/blob/05052c195308f61b63ce2bab130ce0e8cba60604/deepmd/train/trainer.py#L566
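For context, a hedged sketch of the kind of call site that link points to; the function shape, names, and message wording below are assumptions for illustration, not quoted from deepmd-kit:

```python
import logging

log = logging.getLogger(__name__)

def log_start_lr(lr, stop_batch: int) -> None:
    # lr is any schedule object exposing value(step); this is the only
    # place value() would be needed if the trainer only uses it to print
    # the learning rate at the start of training.
    log.info(
        "start training at lr %.2e, final lr will be %.2e",
        lr.value(0),
        lr.value(stop_batch),
    )
```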
Please run pre-commit to format and lint the code: https://docs.deepmodeling.com/projects/deepmd/en/master/development/coding-conventions.html#run-scripts-to-check-the-code. Or you can submit from a non-protected branch and pre-commit.ci can do it for you.
Unit tests should be added for the two new learning rate classes.
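A hedged sketch of what such a test might look like; the class name and constructor signature below are assumptions based on this PR's cosine decay formula, not the actual deepmd-kit API:

```python
import unittest
from math import cos, pi


class CosLearningRate:
    """Stand-in implementing the documented cosine decay formula."""

    def __init__(self, start_lr: float, decay_steps: int, alpha: float = 0.0):
        self.start_lr_ = start_lr
        self.decay_steps_ = decay_steps
        self.alpha_ = alpha

    def value(self, step: int) -> float:
        step = min(step, self.decay_steps_)
        cosine_decay = 0.5 * (1 + cos(pi * step / self.decay_steps_))
        decayed = (1 - self.alpha_) * cosine_decay + self.alpha_
        return self.start_lr_ * decayed


class TestCosLearningRate(unittest.TestCase):
    def test_endpoints(self):
        lr = CosLearningRate(start_lr=1e-3, decay_steps=1000, alpha=0.01)
        self.assertAlmostEqual(lr.value(0), 1e-3)     # starts at start_lr
        self.assertAlmostEqual(lr.value(1000), 1e-5)  # ends at alpha * start_lr
        self.assertAlmostEqual(lr.value(5000), 1e-5)  # clamped past decay_steps


if __name__ == "__main__":
    unittest.main()
```

Checking the endpoints (start_lr at step 0, alpha * start_lr at and beyond decay_steps) pins down the formula; the restart variant could be tested the same way at its restart boundaries.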
```diff
@@ -125,6 +125,7 @@ def gelu_wrapper(x):
     "softplus": tf.nn.softplus,
     "sigmoid": tf.sigmoid,
     "tanh": tf.nn.tanh,
+    "swish": tf.nn.swish,
```
It seems that it has been renamed to `silu`: tensorflow/tensorflow#41066
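A hedged sketch of how the registration could stay compatible across TF versions; the dict name is an assumption, not deepmd-kit's actual identifier:

```python
import tensorflow as tf

# Prefer the new name tf.nn.silu (TF renamed swish to silu), falling back
# to tf.nn.swish on older builds; both compute x * sigmoid(x).
ACTIVATION_FN_DICT = {
    "tanh": tf.nn.tanh,
    "silu": getattr(tf.nn, "silu", tf.nn.swish),
}
```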
```python
    )
else:
    for fitting_key in self.fitting:
        if self.lr_type == "exp":
```
It's not good behavior to switch on the learning rate type in the `Trainer`. Instead, implement the method `LearningRate.log_start` (`LearningRate` should be an abstract base class inherited by all learning rate classes) and call `self.lr.log_start(self.sess)` here.
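A hedged sketch of the suggested structure; the method bodies and the logged message are illustrative assumptions:

```python
from abc import ABC, abstractmethod


class LearningRate(ABC):
    """Abstract base class every learning rate schedule inherits from."""

    @abstractmethod
    def value(self, step: int) -> float:
        """Return the learning rate at a given training step."""

    def log_start(self, sess) -> None:
        # Each schedule logs its own start-of-training information, so
        # the Trainer no longer branches on lr_type; sess is passed for
        # schedules that need to evaluate TF tensors.
        print(f"start training at lr {self.value(0):.2e}")
```

The Trainer then calls `self.lr.log_start(self.sess)` unconditionally, and each subclass (exponential, cosine, cosine-restart) overrides `log_start` to report its own parameters.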
[Argument("exp", dict, learning_rate_exp())], | ||
[Argument("exp", dict, learning_rate_exp()), | ||
Argument("cos", dict, learning_rate_cos()), | ||
Argument("cosrestart", dict, learning_rate_cosrestarts())], |
You may need to add some documentation to the variants (doc="xxx"). Otherwise, no one knows what they are.
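A hedged sketch using the doc keyword of the dargs Argument API; the doc texts are illustrative, and the learning_rate_* helpers are assumed from this PR's diff:

```python
from dargs import Argument

# doc= gives each variant a human-readable description in the generated
# argument documentation (wording below is illustrative, not from the PR).
variants = [
    Argument("exp", dict, learning_rate_exp(),
             doc="Exponentially decayed learning rate."),
    Argument("cos", dict, learning_rate_cos(),
             doc="Cosine-decayed learning rate."),
    Argument("cosrestart", dict, learning_rate_cosrestarts(),
             doc="Cosine-decayed learning rate with warm restarts."),
]
```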
```python
global_step = min(global_step, decay_steps)
cosine_decay = 0.5 * (1 + cos(pi * global_step / decay_steps))
decayed = (1 - alpha) * cosine_decay + alpha
decayed_learning_rate = learning_rate * decayed
```
Please use this style: https://numpydoc.readthedocs.io/en/latest/format.html#other-points-to-keep-in-mind
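A hedged sketch of how that formula could be presented in a numpydoc-style docstring; the surrounding method is illustrative:

```python
def value(self, step: int) -> float:
    """Get the cosine-decayed learning rate at a given step.

    Parameters
    ----------
    step : int
        The current training step.

    Returns
    -------
    float
        The decayed learning rate, computed as::

            global_step = min(step, decay_steps)
            cosine_decay = 0.5 * (1 + cos(pi * global_step / decay_steps))
            decayed = (1 - alpha) * cosine_decay + alpha
            decayed_learning_rate = learning_rate * decayed
    """
```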
````
The function returns the cosine decayed learning rate while taking into account
possible warm restarts.
```
````
This line (the stray ``` fence) should be removed.