
Use non-constant alpha #19

Open
xanderdunn opened this issue Aug 25, 2015 · 0 comments

@xanderdunn (Owner) commented:

From the FAQ:
For batch training, standard backprop usually converges (eventually) to a local minimum, if one exists. For incremental training, standard backprop does not converge to a stationary point of the error surface. To obtain convergence, the learning rate must be slowly reduced. This methodology is called "stochastic approximation" or "annealing".

Try slowly reducing your neural net's alpha; a sketch of one possible decay schedule is below. Ideas. More on annealing.
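
As a concrete starting point, here is a minimal sketch of a slowly decreasing alpha in the stochastic-approximation style described in the FAQ quote above. The names (`annealed_alpha`, `alpha0`, `decay`) and the 1/t form are illustrative assumptions, not anything from this repo:

```python
# Hypothetical sketch: 1/t learning-rate annealing for incremental (online) training.
# alpha0 and decay are illustrative hyperparameters, not values from this project.

def annealed_alpha(alpha0, decay, t):
    """Return the learning rate for update step t (Robbins-Monro style 1/t decay)."""
    return alpha0 / (1.0 + decay * t)

# Example: alpha shrinks from 0.5 toward 0 as training progresses.
alpha0, decay = 0.5, 0.001
for t in range(0, 5001, 1000):
    print(t, annealed_alpha(alpha0, decay, t))
```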

Try other methods of dynamically setting alpha (a hedged sketch of one simple adaptive rule follows this list):

- This paper provides some methods for on-line adaptive alpha.
- LeCun's paper proposes a Hessian-based method.
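
As one illustration of adapting alpha during training (this is the classic "bold driver" heuristic for batch training, not necessarily what either linked paper proposes): grow alpha slightly while the error keeps falling, and cut it sharply when the error rises.

```python
# Hypothetical "bold driver" adaptation of alpha (illustrative only, not from the linked papers).
# grow/shrink are common rule-of-thumb factors, not values from this project.

def adapt_alpha(alpha, prev_error, curr_error, grow=1.05, shrink=0.5):
    """Increase alpha a little when the error improved, cut it sharply when it got worse."""
    if curr_error < prev_error:
        return alpha * grow
    return alpha * shrink

alpha = 0.1
errors = [1.0, 0.8, 0.7, 0.75, 0.6]  # made-up per-epoch errors
for prev, curr in zip(errors, errors[1:]):
    alpha = adapt_alpha(alpha, prev, curr)
    print(round(alpha, 4))
```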

With incremental training, it is much more difficult to concoct an algorithm that automatically adjusts the learning rate during training. Various proposals have appeared in the NN literature, but most of them don't work. Problems with some of these proposals are illustrated by Darken and Moody (1992), who unfortunately do not offer a solution. Some promising results are provided by LeCun, Simard, and Pearlmutter (1993), and by Orr and Leen (1997), who adapt the momentum rather than the learning rate. There is also a variant of stochastic approximation called "iterate averaging" or "Polyak averaging" (Kushner and Yin 1997), which theoretically provides optimal convergence rates by keeping a running average of the weight values. I have no personal experience with these methods; if you have any solid evidence that these or other methods of automatically setting the learning rate and/or momentum in incremental training actually work in a wide variety of NN applications, please inform the FAQ maintainer ([email protected]).
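
A rough sketch of the Polyak/iterate-averaging idea mentioned above, i.e. keeping a running average of the weight iterates and using the average for prediction. NumPy and the generic `weights` vector are assumptions for illustration; the helper name is hypothetical:

```python
# Hypothetical sketch of Polyak ("iterate") averaging: maintain a running mean of the
# weight vectors visited by SGD and use that mean at prediction time.
import numpy as np

def polyak_average(avg_weights, weights, t):
    """Update the running mean of the weight iterates after step t (t starts at 1)."""
    return avg_weights + (weights - avg_weights) / t

# Toy usage: average a noisy sequence of stand-in weight vectors.
rng = np.random.default_rng(0)
avg = np.zeros(3)
for t in range(1, 1001):
    weights = np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=3)  # stand-in SGD iterate
    avg = polyak_average(avg, weights, t)
print(avg)  # close to [1.0, -2.0, 0.5]
```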

@xanderdunn xanderdunn self-assigned this Aug 25, 2015
@xanderdunn xanderdunn modified the milestone: 1.0 Aug 26, 2015
@xanderdunn xanderdunn added the P1 label Aug 26, 2015