Currently the Neuron.train() method uses the Neuron's own error to calculate the delta (which is used to calculate the gradient and update the weights). This works OK for toy networks with a single output Neuron. However, the back propagated delta may in fact need to be the derivative of the total network error for that training sample with respect to the Neuron's input, not the derivative of the Neuron's particular error. This represents the weight change needed to affect the total network error, not just that Neuron's error.
The bottom of this page explains it well (now that I know how to work with derivatives, thanks Khan Academy!).
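For illustration only, here's a minimal sketch of what the delta calculation would look like when it's driven by the total network error rather than the Neuron's own error. This is not the library's actual API; the names (netInput, activationPrime, etc.) and the squared-error assumption are mine:

```js
// Output neuron: delta = dE_total/d(output) * activationPrime(netInput)
function outputNeuronDelta(output, targetOutput, netInput, activationPrime) {
  const dErrorDOutput = output - targetOutput; // assumes a squared-error cost
  return dErrorDOutput * activationPrime(netInput);
}

// Hidden neuron: chain rule — accumulate this neuron's effect on the total
// error through every downstream connection, then scale by activationPrime.
function hiddenNeuronDelta(downstreamDeltas, downstreamWeights, netInput, activationPrime) {
  let sum = 0;
  for (let i = 0; i < downstreamDeltas.length; i++) {
    sum += downstreamDeltas[i] * downstreamWeights[i];
  }
  return sum * activationPrime(netInput);
}
```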
The rule we were using was the Delta Rule, which calculates deltas differently than regular backpropagation. We've switched to basic backpropagation in #92 and are now calculating the delta as output - targetOutput. The issue was as noted in the article linked in this issue: output - targetOutput is not the error itself, it is the derivative of the network's error function (cost function) with respect to the output.
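As a concrete example (a sketch, not the project's code), for a squared-error cost the error and its derivative with respect to the output are:

```js
// Squared-error cost for a single output, and its derivative w.r.t. the output.
// The derivative, not the error itself, is what the delta calculation needs:
// d/d(output) of 0.5 * (output - target)^2  ==  output - target
const squaredError = (output, targetOutput) => 0.5 * Math.pow(output - targetOutput, 2);
const squaredErrorPrime = (output, targetOutput) => output - targetOutput;
```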
In order to use various network error functions, we also need their derivatives, just as our activation functions come in normal/prime pairs. Then, during training, the Neuron needs to use the derivative of the error function when calculating its delta.
Changing this issue to support normal and prime error functions.
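A possible shape for such pairs, mirroring the normal/prime activation functions (illustrative only; the object and function names here are hypothetical, not the library's API):

```js
// Hypothetical normal/prime error function pairs.
const errorFunctions = {
  squaredError: {
    normal: (output, target) => 0.5 * Math.pow(output - target, 2),
    prime: (output, target) => output - target,
  },
  absoluteError: {
    normal: (output, target) => Math.abs(output - target),
    prime: (output, target) => Math.sign(output - target),
  },
};
```

During training the Neuron would then call the prime variant, e.g. errorFunctions.squaredError.prime(output, target), when calculating its delta.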
levithomason changed the title from "Neuron train error may be incorrect" to "Support normal and derivative network error functions" on Dec 28, 2015