Riashat/Policy-Gradient-Reinforcement-Learning

This repository contains code for policy gradient methods in reinforcement learning.

Islam R., Lever G., Shawe-Taylor J., Improving Convergence of Deterministic Policy Gradient Methods in Reinforcement Learning. 2015

  1. Stochastic Policy Gradients
  2. Deterministic Policy Gradients

This repo contains code for actor-critic policy gradient methods in reinforcement learning, using least-squares temporal difference (LSTD) learning with a linear function approximator.

The algorithms we consider include:

  1. Episodic REINFORCE (Monte-Carlo) Actor-Critic Stochastic Policy Gradient
  2. Stochastic Off-Policy Actor-Critic Policy Gradient
  3. Deterministic Policy Gradients
  4. Deterministic Gradients with Stochastic Exploration
  5. Natural Stochastic Policy Gradients
  6. Natural Deterministic Policy Gradients
  7. Deterministic Gradients with Adaptive Step Size Gradient Ascent
  8. Deterministic Gradients with Momentum-Based Nesterov's Accelerated Gradient
  9. Stochastic Gradients with Momentum-Based Nesterov's Accelerated Gradient
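
The actor-critic methods listed above rely on a least-squares temporal difference critic with linear function approximation. The following is a minimal sketch of how such a critic could be fitted from a batch of transitions, written in Python with NumPy. It is not taken from this repository: the function name `lstd`, its parameters, and the two-state example chain are all assumptions made for illustration.

```python
import numpy as np

def lstd(features, rewards, next_features, gamma=0.9, reg=0.0):
    """Hypothetical helper: fit weights w so that V(s) ~= phi(s) @ w.

    features, next_features: arrays of shape (T, d) holding phi(s_t) and phi(s_{t+1}).
    rewards: array of shape (T,).
    reg: optional ridge term added to the diagonal of A (an assumption, default off).
    """
    features = np.asarray(features, dtype=float)
    next_features = np.asarray(next_features, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    # A = sum_t phi_t (phi_t - gamma * phi_{t+1})^T
    A = features.T @ (features - gamma * next_features)
    A += reg * np.eye(A.shape[0])
    # b = sum_t phi_t * r_t
    b = features.T @ rewards
    # Solve A w = b for the value-function weights
    return np.linalg.solve(A, b)

# Illustrative two-state chain with one-hot features (not one of the repo's MDPs):
# state 0 -> state 1 with reward 0, state 1 -> state 1 with reward 1.
phi = np.array([[1.0, 0.0], [0.0, 1.0]])
phi_next = np.array([[0.0, 1.0], [0.0, 1.0]])
r = np.array([0.0, 1.0])
w = lstd(phi, r, phi_next, gamma=0.9)
```

With one-hot features the weights are the state values themselves, so `w` can be checked against the Bellman equation by hand: V(1) = 1 / (1 - 0.9) = 10 and V(0) = 0.9 * V(1) = 9.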

We consider the following MDPs using a Parameterized Controller (Agent):

  1. Toy MDP
  2. Grid World (10x10) MDP
  3. Mountain Car MDP
  4. Cart Pole MDP
  5. Pendulum MDP
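
As a rough illustration of what a parameterized controller can look like, the sketch below defines a linear deterministic policy and a Gaussian stochastic policy over the same features, together with the score function used by stochastic policy gradients. The linear-in-features form, the function names, and the numbers are assumptions for illustration only and may differ from the controllers used in this repository.

```python
import numpy as np

def deterministic_action(theta, phi):
    """Deterministic controller (assumed linear form): a = theta^T phi(s)."""
    return float(theta @ phi)

def sample_gaussian_action(theta, phi, sigma, rng):
    """Stochastic controller: a ~ N(theta^T phi(s), sigma^2)."""
    return float(rng.normal(loc=theta @ phi, scale=sigma))

def gaussian_score(theta, phi, action, sigma):
    """Score function: gradient of log N(action; theta^T phi, sigma^2) w.r.t. theta."""
    return (action - theta @ phi) / sigma**2 * phi

# Hypothetical parameters and state features
theta = np.array([0.5, -1.0])
phi = np.array([2.0, 1.0])

mu = deterministic_action(theta, phi)                      # 0.5*2.0 - 1.0*1.0 = 0.0
score = gaussian_score(theta, phi, action=1.0, sigma=0.5)  # (1 - 0) / 0.25 * phi

rng = np.random.default_rng(0)
a = sample_gaussian_action(theta, phi, sigma=0.5, rng=rng)
```

The deterministic policy is the mean of the Gaussian policy, which is one simple way to relate stochastic exploration to a deterministic controller.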
