Spinning Up in Deep RL (Simplified)

A streamlined implementation of OpenAI's Spinning Up using PyTorch (CPU only) that retains the core reinforcement learning algorithms while removing the logging functionality and MPI dependencies. It also works on Windows, which the original Spinning Up does not appear to support.

Algorithms

  • Vanilla Policy Gradient (VPG)
  • Proximal Policy Optimization (PPO)
  • Deep Deterministic Policy Gradient (DDPG)
  • Twin Delayed Deep Deterministic Policy Gradient (TD3)
  • Soft Actor-Critic (SAC)
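
For orientation, here is a minimal sketch of the policy-gradient loss that VPG optimizes (and that PPO modifies with a clipped probability ratio). It is illustrative only; `logp` and `adv` are placeholder names, not the variables used in the notebooks.

```python
import torch

def vpg_policy_loss(logp: torch.Tensor, adv: torch.Tensor) -> torch.Tensor:
    """Vanilla policy-gradient loss: ascend E[log pi(a|s) * A(s, a)].

    logp: log-probabilities of the actions actually taken under the current policy.
    adv:  advantage estimates for those actions (e.g. from GAE).
    """
    return -(logp * adv).mean()  # negate because optimizers minimize
```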

Usage

Each algorithm is contained within its own Jupyter Notebook. The notebooks are structured as follows:

  • Imports
  • Helper functions
  • Model
  • Buffer
  • Experiment
  • Visualization

The hyperparameters are generally consistent with the original implementation, so these notebooks can be run with minimal adjustment to produce results similar to the original benchmarks (at least on HalfCheetah-v4).
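
As a rough, self-contained sketch of the kind of environment-interaction loop the Experiment sections wrap around each algorithm (a random policy stands in for the learned actor so the snippet runs on its own; the notebooks run far longer, e.g. 250,000 steps):

```python
import gymnasium as gym

env = gym.make("HalfCheetah-v4")
obs, _ = env.reset(seed=0)

for t in range(1_000):
    action = env.action_space.sample()  # replace with the trained policy's action
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()
env.close()
```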

Requirements

Requirements are minimal:

  • Python 3.8.20 (plus the standard-library itertools and copy modules)
  • PyTorch 2.4.1
  • SciPy 1.10.1
  • Gymnasium 0.29.1
  • MuJoCo 3.1.6
  • NumPy 1.24.3
  • Pillow 10.4.0
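
One way to pin these is a requirements file like the following (hypothetical; the `gymnasium[mujoco]` extra is an assumption, used here to pull in the MuJoCo bindings the environments need):

```
torch==2.4.1
scipy==1.10.1
gymnasium[mujoco]==0.29.1
mujoco==3.1.6
numpy==1.24.3
pillow==10.4.0
```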

Example GIFs (visualized after 250,000 TotalEnvInteracts)

| Algorithm | HalfCheetah-v4 |
| --------- | -------------- |
| VPG       | vpg GIF        |
| PPO       | ppo GIF        |
| DDPG      | ddpg GIF       |
| TD3       | td3 GIF        |
| SAC       | sac GIF        |
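
GIFs like these can be produced in the Visualization step by rolling out a policy with `rgb_array` rendering and saving the frames with Pillow. The sketch below is illustrative and self-contained (a random policy stands in for the trained actor):

```python
import gymnasium as gym
from PIL import Image

env = gym.make("HalfCheetah-v4", render_mode="rgb_array")
obs, _ = env.reset(seed=0)

frames = []
for _ in range(200):
    frames.append(Image.fromarray(env.render()))  # capture the current frame
    action = env.action_space.sample()            # replace with the trained policy
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()
env.close()

# Write the collected frames out as an animated GIF.
frames[0].save("halfcheetah.gif", save_all=True, append_images=frames[1:],
               duration=30, loop=0)
```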
