2020-10-07 added support for TensorFlow 2.3.1
PPO and PPO_CNN agents playing Pong-v0 game:
2020-10-10 added LunarLander-v2_PPO Continuous code for TensorFlow 2.3.1:
2020-10-23 added BipedalWalker-v3_PPO code for TensorFlow 2.3.1:
-
Epsilon Greedy Dueling Double Deep Q Learning tutorial (D3QN)
-
Introduction to RL Asynchronous Advantage Actor-Critic algorithm (A3C)
-
Introduction to RL Proximal Policy Optimization algorithm (PPO)
-
Let’s code a discrete Reinforcement Learning rocket landing agent from scratch! (PPO)
-
Continuous Proximal Policy Optimization Tutorial with OpenAI gym environment! (PPO)
PPO Pong-v0 Learning curve: