Spinning Up in Deep RL (Simplified)

A streamlined PyTorch (CPU) implementation of Spinning Up that retains the core reinforcement learning algorithms while removing the logging functionality and MPI dependencies. It also runs on Windows, which the original Spinning Up does not appear to support.

Algorithms

Vanilla Policy Gradient (VPG)
Proximal Policy Optimization (PPO)
Deep Deterministic Policy Gradient (DDPG)
Twin Delayed Deep Deterministic Policy Gradient (TD3)
Soft Actor-Critic (SAC)

Usage

Each algorithm is contained within its own Jupyter Notebook. The notebooks are structured as follows:

Imports
Helper functions
Model
Buffer
Experiment
Visualization
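
As an illustration of the kind of routine the Helper functions cell might hold, here is a minimal sketch of a discounted reward-to-go computation, which policy-gradient methods such as VPG and PPO rely on. The name `discount_cumsum` follows the helper in the original Spinning Up code. This pure-Python version is an assumption for illustration and is not taken from the notebooks, which may use SciPy or NumPy instead.

```python
from typing import List, Sequence


def discount_cumsum(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return discounted rewards-to-go: out[t] = sum_k gamma**k * rewards[t + k]."""
    out = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the already-discounted tail.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out


if __name__ == "__main__":
    # Three rewards of 1.0 with gamma = 0.5.
    print(discount_cumsum([1.0, 1.0, 1.0], gamma=0.5))  # [1.75, 1.5, 1.0]
```

The backward pass keeps the computation linear in the episode length, since each entry is derived from the one after it.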

The parameters are generally consistent with the original implementation, so the notebooks can be run with minimal adjustments to produce results similar to the original Spinning Up benchmarks (at least on HalfCheetah-v4).

Requirements

Requirements are minimal:

Python 3.8.20 (plus the standard-library modules itertools and copy)
PyTorch 2.4.1
SciPy 1.10.1
Gymnasium 0.29.1
MuJoCo 3.1.6
NumPy 1.24.3
Pillow 10.4.0
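
To confirm that an environment matches the versions listed above before opening a notebook, a short check along the following lines may help. This is a sketch rather than part of the repository: the `PINNED` table, the function names, and the PyPI distribution names used as keys are all assumptions. It relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata
from typing import Dict, Optional

# Versions copied from the Requirements list. The keys are assumed to be
# the PyPI distribution names of each package.
PINNED = {
    "torch": "2.4.1",
    "scipy": "1.10.1",
    "gymnasium": "0.29.1",
    "mujoco": "3.1.6",
    "numpy": "1.24.3",
    "pillow": "10.4.0",
}


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def check_versions(pinned: Dict[str, str]) -> Dict[str, str]:
    """Map each distribution to 'ok', 'missing', or a mismatch message."""
    report = {}
    for dist, wanted in pinned.items():
        found = installed_version(dist)
        if found is None:
            report[dist] = "missing"
        elif found.split("+")[0] == wanted:
            # Ignore local build suffixes such as '+cpu'.
            report[dist] = "ok"
        else:
            report[dist] = f"mismatch (found {found}, listed {wanted})"
    return report


if __name__ == "__main__":
    for name, status in check_versions(PINNED).items():
        print(f"{name}: {status}")
```

A mismatch does not necessarily mean the notebooks will fail; the listed versions are simply the ones stated in this README.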

Example GIFs (visualized after 250,000 TotalEnvInteracts)

Algorithm	HalfCheetah-v4
VPG
PPO
DDPG
TD3
SAC

jchiwai/spinningup-simplified

Folders and files

Latest commit

History

Repository files navigation

Spinning Up in Deep RL (Simplified)

Algorithms

Usage

Requirements

Example GIFs (visualized after 250,000 TotalEnvInteracts)

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages