Gradient Accumulation with Dual (optimizer, scheduler) Training #14999
celsofranssa asked this question in code help: NLP / ASR / TTS
Hello, Lightning community,
I am using a dual (optimizer, scheduler) training setup, as shown in the code snippet below:
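The original snippet did not survive extraction; a minimal sketch of what such a `configure_optimizers` might look like follows (the model attributes, optimizer choices, and scheduler settings here are assumptions, not the original code):

```python
import torch
from pytorch_lightning import LightningModule


class DualOptimizerModel(LightningModule):
    # ... model definition omitted ...

    def configure_optimizers(self):
        # one (optimizer, scheduler) pair per parameter group (names assumed)
        optimizer_1 = torch.optim.Adam(self.encoder.parameters(), lr=1e-3)
        optimizer_2 = torch.optim.Adam(self.decoder.parameters(), lr=1e-3)
        scheduler_1 = torch.optim.lr_scheduler.StepLR(optimizer_1, step_size=10)
        scheduler_2 = torch.optim.lr_scheduler.StepLR(optimizer_2, step_size=10)
        return (
            {"optimizer": optimizer_1, "lr_scheduler": scheduler_1, "frequency": 1},
            {"optimizer": optimizer_2, "lr_scheduler": scheduler_2, "frequency": 1},
        )
```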
With `"frequency": 1` on both optimizers, the trainer calls `optimizer_1` in step `i` and `optimizer_2` in step `(i+1)`.

Is there an approach to combine gradient accumulation with this optimization setup, so that `optimizer_1` uses the gradient accumulated over steps `(i-1)` and `i`, while `optimizer_2` uses the gradient accumulated over steps `i` and `(i+1)`?
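For concreteness, one way to express that overlapping accumulation pattern is with Lightning's manual optimization, controlling the accumulation window explicitly in `training_step`. This is only a hypothetical sketch, not part of the original post: `compute_loss` is an assumed helper, the schedulers would also need to be stepped by hand (e.g. via `self.lr_schedulers()`), and it requires `self.automatic_optimization = False` in `__init__`:

```python
def training_step(self, batch, batch_idx):
    opt_1, opt_2 = self.optimizers()

    loss = self.compute_loss(batch)  # hypothetical loss helper
    self.manual_backward(loss)       # gradients accumulate on both parameter sets

    if batch_idx % 2 == 1:
        # odd step i: optimizer_1 consumes the gradient accumulated over steps (i-1) and i
        opt_1.step()
        opt_1.zero_grad()
    else:
        if batch_idx > 0:
            # even step (i+1): optimizer_2 consumes the gradient accumulated over steps i and (i+1)
            opt_2.step()
        # zeroing here (including at batch 0) starts optimizer_2's next two-step window
        opt_2.zero_grad()

    return loss
```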