Spinning Up in Deep RL (Simplified)

A streamlined implementation of Spinning Up using PyTorch (CPU) that retains the core reinforcement learning algorithms while removing the logging functionality and MPI dependencies. It also works on Windows (the original Spinning Up does not appear to support Windows).

Algorithms

  • Vanilla Policy Gradient (VPG)
  • Proximal Policy Optimization (PPO)
  • Deep Deterministic Policy Gradient (DDPG)
  • Twin Delayed Deep Deterministic Policy Gradient (TD3)
  • Soft Actor-Critic (SAC)
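
For orientation, the clipped surrogate objective that PPO optimizes can be written in a few lines of PyTorch. This is a generic sketch of the standard PPO-clip loss, not a copy of the notebook code; the function name and the clip_ratio default are illustrative.

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_ratio=0.2):
    # Probability ratio pi_theta(a|s) / pi_theta_old(a|s), computed in log space.
    ratio = torch.exp(logp_new - logp_old)
    # Clipped surrogate objective: take the elementwise minimum of the
    # unclipped and clipped terms, then negate for gradient descent.
    clipped = torch.clamp(ratio, 1.0 - clip_ratio, 1.0 + clip_ratio) * advantages
    return -torch.min(ratio * advantages, clipped).mean()
```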

Usage

Each algorithm is contained within its own Jupyter Notebook. The notebooks are structured as follows:

  • Imports
  • Helper functions
  • Model
  • Buffer
  • Experiment
  • Visualization
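
As an illustration of the Buffer section, the off-policy algorithms (DDPG, TD3, SAC) typically store transitions in a simple replay buffer along these lines. This is a generic sketch rather than the notebooks' exact class; the names, default capacity, and batch size are illustrative.

```python
import numpy as np
import torch

class ReplayBuffer:
    """Fixed-size FIFO buffer of (obs, act, rew, next_obs, done) transitions."""

    def __init__(self, obs_dim, act_dim, size=100_000):
        self.obs = np.zeros((size, obs_dim), dtype=np.float32)
        self.act = np.zeros((size, act_dim), dtype=np.float32)
        self.rew = np.zeros(size, dtype=np.float32)
        self.next_obs = np.zeros((size, obs_dim), dtype=np.float32)
        self.done = np.zeros(size, dtype=np.float32)
        self.ptr, self.count, self.size = 0, 0, size

    def store(self, o, a, r, o2, d):
        # Overwrite the oldest transition once the buffer is full.
        self.obs[self.ptr], self.act[self.ptr] = o, a
        self.rew[self.ptr], self.next_obs[self.ptr], self.done[self.ptr] = r, o2, d
        self.ptr = (self.ptr + 1) % self.size
        self.count = min(self.count + 1, self.size)

    def sample(self, batch_size=256):
        idx = np.random.randint(0, self.count, size=batch_size)
        batch = dict(obs=self.obs[idx], act=self.act[idx], rew=self.rew[idx],
                     next_obs=self.next_obs[idx], done=self.done[idx])
        return {k: torch.as_tensor(v) for k, v in batch.items()}
```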

The hyperparameters are generally consistent with the original implementation, so these notebooks can be run with minimal adjustments to produce results similar to the original Spinning Up benchmarks (at least on HalfCheetah-v4).

Requirements

Requirements are minimal:

  • Python 3.8.20 (plus the standard-library modules itertools and copy)
  • PyTorch 2.4.1
  • SciPy 1.10.1
  • Gymnasium 0.29.1
  • MuJoCo 3.1.6
  • NumPy 1.24.3
  • Pillow 10.4.0
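
Once these packages are installed, a quick way to confirm the stack works (especially MuJoCo on Windows) is to build the benchmark environment and step it with random actions. This is a generic sanity check, not part of the notebooks.

```python
import gymnasium as gym
import torch

# Create the MuJoCo environment used in the GIFs below and take a few random steps.
env = gym.make("HalfCheetah-v4")
obs, info = env.reset(seed=0)
for _ in range(10):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()

print(torch.__version__)   # expect 2.4.1
print(obs.shape)           # (17,) observation vector for HalfCheetah-v4
```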

Example GIFs (visualized after 250,000 TotalEnvInteracts)

Algorithm   HalfCheetah-v4
VPG         vpg GIF
PPO         ppo GIF
DDPG        ddpg GIF
TD3         td3 GIF
SAC         sac GIF
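
The GIFs above are rendered from evaluation rollouts. A minimal sketch of how such a GIF can be produced with Gymnasium's rgb_array renderer and Pillow follows; the random-action placeholder stands in for the trained policy, and the filename and frame count are illustrative.

```python
import gymnasium as gym
from PIL import Image

# Render frames as RGB arrays and collect them for an animated GIF.
env = gym.make("HalfCheetah-v4", render_mode="rgb_array")
obs, info = env.reset(seed=0)
frames = []
for _ in range(200):
    action = env.action_space.sample()  # replace with the trained policy's action
    obs, reward, terminated, truncated, info = env.step(action)
    frames.append(Image.fromarray(env.render()))
    if terminated or truncated:
        obs, info = env.reset()
env.close()

# Save the frames as an animated GIF (33 ms per frame, roughly 30 FPS).
frames[0].save("halfcheetah.gif", save_all=True,
               append_images=frames[1:], duration=33, loop=0)
```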