diff --git a/01_pytorch_workflow.ipynb b/01_pytorch_workflow.ipynb
index a38bc67a..78d95b22 100644
--- a/01_pytorch_workflow.ipynb
+++ b/01_pytorch_workflow.ipynb
@@ -880,7 +880,7 @@
     ">\n",
     "> And on the ordering of things, the above is a good default order but you may see slightly different orders. Some rules of thumb: \n",
     "> * Calculate the loss (`loss = ...`) *before* performing backpropagation on it (`loss.backward()`).\n",
-    "> * Zero gradients (`optimizer.zero_grad()`) *before* stepping them (`optimizer.step()`).\n",
+    "> * Zero gradients (`optimizer.zero_grad()`) *before* computing the gradients of the loss with respect to every model parameter (`loss.backward()`).\n",
     "> * Step the optimizer (`optimizer.step()`) *after* performing backpropagation on the loss (`loss.backward()`).\n",
     "\n",
     "For resources to help understand what's happening behind the scenes with backpropagation and gradient descent, see the extra-curriculum section.\n"
diff --git a/docs/01_pytorch_workflow.ipynb b/docs/01_pytorch_workflow.ipynb
index b9176f5e..a1112909 100644
--- a/docs/01_pytorch_workflow.ipynb
+++ b/docs/01_pytorch_workflow.ipynb
@@ -881,7 +881,7 @@
     ">\n",
     "> And on the ordering of things, the above is a good default order but you may see slightly different orders. Some rules of thumb: \n",
     "> * Calculate the loss (`loss = ...`) *before* performing backpropagation on it (`loss.backward()`).\n",
-    "> * Zero gradients (`optimizer.zero_grad()`) *before* stepping them (`optimizer.step()`).\n",
+    "> * Zero gradients (`optimizer.zero_grad()`) *before* computing the gradients of the loss with respect to every model parameter (`loss.backward()`).\n",
    "> * Step the optimizer (`optimizer.step()`) *after* performing backpropagation on the loss (`loss.backward()`).\n",
     "\n",
     "For resources to help understand what's happening behind the scenes with backpropagation and gradient descent, see the extra-curriculum section.\n"
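For context on the ordering the amended bullet describes, here is a minimal training-step sketch. The model, data, loss function and optimizer below are illustrative assumptions and are not taken from the patched notebook; the point is only the order: calculate the loss, zero accumulated gradients with `optimizer.zero_grad()` before `loss.backward()` computes fresh ones, and call `optimizer.step()` only after backpropagation has run.

```python
import torch
from torch import nn

# Illustrative setup (assumed, not from the patch): a tiny linear model on dummy data.
model = nn.Linear(in_features=1, out_features=1)
loss_fn = nn.L1Loss()
optimizer = torch.optim.SGD(params=model.parameters(), lr=0.01)

X = torch.randn(8, 1)
y = 0.7 * X + 0.3

for epoch in range(3):
    model.train()

    # 1. Forward pass
    y_pred = model(X)

    # 2. Calculate the loss *before* backpropagating on it
    loss = loss_fn(y_pred, y)

    # 3. Zero accumulated gradients *before* loss.backward() computes new ones
    optimizer.zero_grad()

    # 4. Backpropagation: gradients of the loss w.r.t. every model parameter
    loss.backward()

    # 5. Step the optimizer *after* backpropagation
    optimizer.step()
```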