From the FAQ:
For batch training, standard backprop usually converges (eventually) to a local minimum, if one exists. For incremental training, standard backprop does not converge to a stationary point of the error surface. To obtain convergence, the learning rate must be slowly reduced. This methodology is called "stochastic approximation" or "annealing".
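Concretely, the "slowly reduced" learning rate the FAQ describes is usually taken to mean a step-size sequence $\alpha_t$ satisfying the standard stochastic-approximation (Robbins-Monro) conditions

$$\sum_{t=1}^{\infty} \alpha_t = \infty, \qquad \sum_{t=1}^{\infty} \alpha_t^2 < \infty,$$

which a schedule such as $\alpha_t = \alpha_0 / (1 + t/\tau)$ satisfies (for any constant $\tau > 0$, chosen here purely for illustration), while a constant $\alpha$ does not.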
- Try slowly reducing your neural net's alpha (Ideas; More on annealing). A minimal sketch of such a schedule follows this list.
- Try other methods of dynamically setting alpha:
  - This paper provides some methods for on-line adaptive alpha.
  - LeCun's paper proposes a Hessian method.
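To make the first suggestion concrete, here is a minimal sketch of per-example (incremental) SGD with an annealed alpha. It assumes a plain linear model trained on squared error; the names `anneal_alpha` and `incremental_sgd` and the `tau` constant are made up for this example and are not taken from any of the papers linked above.

```python
import numpy as np

def anneal_alpha(alpha0, t, tau=100.0):
    """Hypothetical 1/t-style schedule: alpha_t = alpha0 / (1 + t / tau).

    The steps shrink toward zero while their sum still diverges, which is
    what lets incremental SGD settle instead of jittering around a minimum.
    """
    return alpha0 / (1.0 + t / tau)

def incremental_sgd(X, y, alpha0=0.1, epochs=10, seed=0):
    """Per-example (incremental) training of a linear model with an annealed alpha."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):        # present examples in random order
            t += 1
            err = X[i] @ w - y[i]                # prediction error on one example
            grad = err * X[i]                    # gradient of 0.5 * err**2 w.r.t. w
            w -= anneal_alpha(alpha0, t) * grad  # annealed step instead of a fixed alpha
    return w
```

With a fixed `alpha0` the weights keep bouncing around the minimum (the non-convergence the FAQ describes); with the decaying schedule the step sizes shrink and the iterates settle.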
With incremental training, it is much more difficult to concoct an algorithm that automatically adjusts the learning rate during training. Various proposals have appeared in the NN literature, but most of them don't work. Problems with some of these proposals are illustrated by Darken and Moody (1992), who unfortunately do not offer a solution. Some promising results are provided by LeCun, Simard, and Pearlmutter (1993), and by Orr and Leen (1997), who adapt the momentum rather than the learning rate. There is also a variant of stochastic approximation called "iterate averaging" or "Polyak averaging" (Kushner and Yin 1997), which theoretically provides optimal convergence rates by keeping a running average of the weight values. I have no personal experience with these methods; if you have any solid evidence that these or other methods of automatically setting the learning rate and/or momentum in incremental training actually work in a wide variety of NN applications, please inform the FAQ maintainer ([email protected]).
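For the "iterate averaging" / Polyak averaging variant mentioned in the quote, a minimal sketch under the same hypothetical linear-model setup is just a running average of the weight iterates kept alongside ordinary incremental SGD. (In the theory the averaging is normally combined with a slowly decaying step size; the constant alpha here only keeps the example short.)

```python
import numpy as np

def sgd_with_polyak_averaging(X, y, alpha=0.05, epochs=10, seed=0):
    """Incremental SGD that also maintains a running (Polyak) average of the weights.

    The averaged iterate w_bar, not the raw iterate w, is returned as the
    final estimate; the averaging is what the quote credits with the
    improved convergence rates (Kushner and Yin 1997).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    w_bar = w.copy()
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            err = X[i] @ w - y[i]
            w -= alpha * (err * X[i])    # ordinary incremental update
            w_bar += (w - w_bar) / t     # running mean of w_1 .. w_t
    return w_bar
```

The raw iterates `w` can keep fluctuating while their average `w_bar` converges much more smoothly.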