Value of Epsilon Decay Period #201
Comments
Also, the JAX Full Rainbow agent uses Noisy Nets, and when Noisy Nets are used, epsilon-greedy is disabled (as in the paper snippet above, and in some other implementations such as Kaixhin's Rainbow here and here). However, I still see the |
Thank you for pointing this out! This has been fixed here: ed92c57
Thanks! As for
Should the |
In the TF version of DQN, the value of epsilon_decay_period is set to 1M steps (see here), and for Rainbow, the value is set to 250k steps (see here).
However, the Rainbow paper says they anneal over 4M frames (i.e. 1M steps) for DQN, as done in Dopamine above. Importantly, without Noisy Nets (which is the case with TF Rainbow), they anneal in the first 250K frames, not steps. With the standard frame skip of 4, that would be 62,500 steps.
Is there a discrepancy here (Rainbow should anneal within 62.5k steps rather than 250k steps), or am I misunderstanding something? Or perhaps it really doesn't matter? Thank you for your time.
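To make the arithmetic in the question concrete, here is a minimal sketch of the frame-to-step conversion and of how the two candidate decay periods would differ under a linear schedule. The helper names `frames_to_steps` and `linear_epsilon` are hypothetical and are not Dopamine's API. The frame skip of 4 comes from the question, while the start and final epsilon values (1.0 and 0.01) are placeholder assumptions used only for illustration.

```python
FRAME_SKIP = 4  # assumed standard Atari frame skip, as stated in the question


def frames_to_steps(frames, frame_skip=FRAME_SKIP):
    """Convert emulator frames to agent steps (one action every `frame_skip` frames)."""
    return frames // frame_skip


def linear_epsilon(step, decay_period, eps_start=1.0, eps_final=0.01):
    """Hypothetical linear schedule: eps_start at step 0, eps_final from decay_period on.

    eps_start and eps_final are illustrative placeholders, not values taken
    from the paper or from Dopamine's configs.
    """
    fraction = min(max(step / decay_period, 0.0), 1.0)
    return eps_start + fraction * (eps_final - eps_start)


dqn_decay_steps = frames_to_steps(4_000_000)    # 1_000_000 steps
rainbow_decay_steps = frames_to_steps(250_000)  # 62_500 steps

# Compare the paper-derived period (62,500 steps) with the configured 250k steps.
for step in (0, 62_500, 250_000):
    eps_paper = linear_epsilon(step, rainbow_decay_steps)
    eps_config = linear_epsilon(step, 250_000)
    print(f"step={step:>7}  eps(62.5k period)={eps_paper:.4f}  eps(250k period)={eps_config:.4f}")
```

Under these assumptions, the 62,500-step schedule has already reached its final epsilon at the point where the 250k-step schedule is only a quarter of the way through its decay.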
Screenshot of page 4 of Rainbow paper