Spinning Up in Deep RL (Simplified)

A streamlined implementation of Spinning Up using PyTorch (CPU) that retains the core reinforcement learning algorithms while removing the logging functionality and MPI dependencies. It also works on Windows (the original Spinning Up does not appear to support Windows).

Algorithms

  • Vanilla Policy Gradient (VPG)
  • Proximal Policy Optimization (PPO)
  • Deep Deterministic Policy Gradient (DDPG)
  • Twin Delayed Deep Deterministic Policy Gradient (TD3)
  • Soft Actor-Critic (SAC)
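
For orientation, the clipped surrogate objective that PPO optimizes can be written in a few lines of PyTorch. This is a generic sketch of the standard PPO-clip loss, not a copy of the notebook code; the function name and the clip_ratio default are illustrative.

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_ratio=0.2):
    # Probability ratio pi_theta(a|s) / pi_theta_old(a|s), computed in log space.
    ratio = torch.exp(logp_new - logp_old)
    # Clipped surrogate objective: take the elementwise minimum of the
    # unclipped and clipped terms, then negate for gradient descent.
    clipped = torch.clamp(ratio, 1.0 - clip_ratio, 1.0 + clip_ratio) * advantages
    return -torch.min(ratio * advantages, clipped).mean()
```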

Usage

Each algorithm is contained within its own Jupyter Notebook. The notebooks are structured as follows:

  • Imports
  • Helper functions
  • Model
  • Buffer
  • Experiment
  • Visualization
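
As an illustration of the Buffer section, the off-policy algorithms (DDPG, TD3, SAC) typically store transitions in a simple replay buffer along these lines. This is a generic sketch rather than the notebooks' exact class; the names, default capacity, and batch size are illustrative.

```python
import numpy as np
import torch

class ReplayBuffer:
    """Fixed-size FIFO buffer of (obs, act, rew, next_obs, done) transitions."""

    def __init__(self, obs_dim, act_dim, size=100_000):
        self.obs = np.zeros((size, obs_dim), dtype=np.float32)
        self.act = np.zeros((size, act_dim), dtype=np.float32)
        self.rew = np.zeros(size, dtype=np.float32)
        self.next_obs = np.zeros((size, obs_dim), dtype=np.float32)
        self.done = np.zeros(size, dtype=np.float32)
        self.ptr, self.count, self.size = 0, 0, size

    def store(self, o, a, r, o2, d):
        # Overwrite the oldest transition once the buffer is full.
        self.obs[self.ptr], self.act[self.ptr] = o, a
        self.rew[self.ptr], self.next_obs[self.ptr], self.done[self.ptr] = r, o2, d
        self.ptr = (self.ptr + 1) % self.size
        self.count = min(self.count + 1, self.size)

    def sample(self, batch_size=256):
        idx = np.random.randint(0, self.count, size=batch_size)
        batch = dict(obs=self.obs[idx], act=self.act[idx], rew=self.rew[idx],
                     next_obs=self.next_obs[idx], done=self.done[idx])
        return {k: torch.as_tensor(v) for k, v in batch.items()}
```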

The hyperparameters are generally consistent with the original implementation, so these notebooks can be run with minimal adjustments to produce results similar to the original Spinning Up benchmarks (at least on HalfCheetah-v4).

Requirements

Requirements are minimal:

  • Python 3.8.20 (plus the standard-library modules itertools and copy)
  • PyTorch 2.4.1
  • SciPy 1.10.1
  • Gymnasium 0.29.1
  • MuJoCo 3.1.6
  • NumPy 1.24.3
  • Pillow 10.4.0
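
Once these packages are installed, a quick way to confirm the stack works (especially MuJoCo on Windows) is to build the benchmark environment and step it with random actions. This is a generic sanity check, not part of the notebooks.

```python
import gymnasium as gym
import torch

# Create the MuJoCo environment used in the GIFs below and take a few random steps.
env = gym.make("HalfCheetah-v4")
obs, info = env.reset(seed=0)
for _ in range(10):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()

print(torch.__version__)   # expect 2.4.1
print(obs.shape)           # (17,) observation vector for HalfCheetah-v4
```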

Example GIFs (visualized after 250,000 TotalEnvInteracts)

Algorithm   HalfCheetah-v4
VPG         vpg GIF
PPO         ppo GIF
DDPG        ddpg GIF
TD3         td3 GIF
SAC         sac GIF
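
The GIFs above are rendered from evaluation rollouts. A minimal sketch of how such a GIF can be produced with Gymnasium's rgb_array renderer and Pillow follows; the random-action placeholder stands in for the trained policy, and the filename and frame count are illustrative.

```python
import gymnasium as gym
from PIL import Image

# Render frames as RGB arrays and collect them for an animated GIF.
env = gym.make("HalfCheetah-v4", render_mode="rgb_array")
obs, info = env.reset(seed=0)
frames = []
for _ in range(200):
    action = env.action_space.sample()  # replace with the trained policy's action
    obs, reward, terminated, truncated, info = env.step(action)
    frames.append(Image.fromarray(env.render()))
    if terminated or truncated:
        obs, info = env.reset()
env.close()

# Save the frames as an animated GIF (33 ms per frame, roughly 30 FPS).
frames[0].save("halfcheetah.gif", save_all=True,
               append_images=frames[1:], duration=33, loop=0)
```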