diff --git a/01_pytorch_workflow.ipynb b/01_pytorch_workflow.ipynb
index a38bc67a..78d95b22 100644
--- a/01_pytorch_workflow.ipynb
+++ b/01_pytorch_workflow.ipynb
@@ -880,7 +880,7 @@
     ">\n",
     "> And on the ordering of things, the above is a good default order but you may see slightly different orders. Some rules of thumb: \n",
     "> * Calculate the loss (`loss = ...`) *before* performing backpropagation on it (`loss.backward()`).\n",
-    "> * Zero gradients (`optimizer.zero_grad()`) *before* stepping them (`optimizer.step()`).\n",
+    "> * Zero gradients (`optimizer.zero_grad()`) *before* computing the gradients of the loss with respect to every model parameter (`loss.backward()`).\n",
     "> * Step the optimizer (`optimizer.step()`) *after* performing backpropagation on the loss (`loss.backward()`).\n",
     "\n",
     "For resources to help understand what's happening behind the scenes with backpropagation and gradient descent, see the extra-curriculum section.\n"
diff --git a/docs/01_pytorch_workflow.ipynb b/docs/01_pytorch_workflow.ipynb
index b9176f5e..a1112909 100644
--- a/docs/01_pytorch_workflow.ipynb
+++ b/docs/01_pytorch_workflow.ipynb
@@ -881,7 +881,7 @@
     ">\n",
     "> And on the ordering of things, the above is a good default order but you may see slightly different orders. Some rules of thumb: \n",
     "> * Calculate the loss (`loss = ...`) *before* performing backpropagation on it (`loss.backward()`).\n",
-    "> * Zero gradients (`optimizer.zero_grad()`) *before* stepping them (`optimizer.step()`).\n",
+    "> * Zero gradients (`optimizer.zero_grad()`) *before* computing the gradients of the loss with respect to every model parameter (`loss.backward()`).\n",
    "> * Step the optimizer (`optimizer.step()`) *after* performing backpropagation on the loss (`loss.backward()`).\n",
     "\n",
     "For resources to help understand what's happening behind the scenes with backpropagation and gradient descent, see the extra-curriculum section.\n"
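For context on the ordering the amended bullet describes, here is a minimal training-step sketch. The model, data, loss function and optimizer below are illustrative assumptions and are not taken from the patched notebook; the point is only the order: calculate the loss, zero accumulated gradients with `optimizer.zero_grad()` before `loss.backward()` computes fresh ones, and call `optimizer.step()` only after backpropagation has run.

```python
import torch
from torch import nn

# Illustrative setup (assumed, not from the patch): a tiny linear model on dummy data.
model = nn.Linear(in_features=1, out_features=1)
loss_fn = nn.L1Loss()
optimizer = torch.optim.SGD(params=model.parameters(), lr=0.01)

X = torch.randn(8, 1)
y = 0.7 * X + 0.3

for epoch in range(3):
    model.train()

    # 1. Forward pass
    y_pred = model(X)

    # 2. Calculate the loss *before* backpropagating on it
    loss = loss_fn(y_pred, y)

    # 3. Zero accumulated gradients *before* loss.backward() computes new ones
    optimizer.zero_grad()

    # 4. Backpropagation: gradients of the loss w.r.t. every model parameter
    loss.backward()

    # 5. Step the optimizer *after* backpropagation
    optimizer.step()
```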