A streamlined implementation of Spinning Up in PyTorch (CPU) that retains the core reinforcement learning algorithms while removing the logging functionality and MPI dependencies. It also works on Windows, which the original Spinning Up does not appear to support. The following algorithms are implemented:
- Vanilla Policy Gradient (VPG)
- Proximal Policy Optimization (PPO)
- Deep Deterministic Policy Gradient (DDPG)
- Twin Delayed Deep Deterministic Policy Gradient (TD3)
- Soft Actor-Critic (SAC)
Each algorithm is contained within its own Jupyter Notebook. The notebooks are structured as follows:
- Imports
- Helper functions
- Model
- Buffer
- Experiment
- Visualization
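As an illustration of what the helper functions section might hold, here is a minimal sketch of a discounted cumulative sum, the quantity an on-policy buffer (VPG, PPO) needs to turn per-step rewards into returns. The function name `discount_cumsum` and the pure-Python loop are assumptions made for this example; the notebooks may compute the same quantity differently, for instance with SciPy.

```python
from typing import List, Sequence


def discount_cumsum(rewards: Sequence[float], discount: float) -> List[float]:
    """Return, for every time step, the discounted sum of rewards from that step on."""
    out = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return already computed for the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + discount * running
        out[t] = running
    return out


print(discount_cumsum([1.0, 1.0, 1.0], 0.5))  # [1.75, 1.5, 1.0]
```

The backward pass keeps the cost linear in the episode length, which matters when the buffer holds several thousand steps per epoch.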
The parameters are generally consistent with the original implementation, so these notebooks can be run with minimal adjustments to produce results similar to the original benchmark (at least on HalfCheetah-v4).
Requirements are minimal:
- Python 3.8.20 (+ itertools, copy)
- Torch 2.4.1
- Scipy 1.10.1
- Gymnasium 0.29.1
- MuJoCo 3.1.6
- Numpy 1.24.3
- Pillow 10.4.0
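To check whether a local environment matches these versions, something like the following sketch could be run before opening a notebook. It is not part of the repository: the `PINNED` dictionary and the `find_mismatches` helper are hypothetical names for this example, and the package names are assumed to be the usual PyPI distribution names.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions copied from the requirements list; keys are assumed PyPI distribution names.
PINNED: Dict[str, str] = {
    "torch": "2.4.1",
    "scipy": "1.10.1",
    "gymnasium": "0.29.1",
    "mujoco": "3.1.6",
    "numpy": "1.24.3",
    "pillow": "10.4.0",
}


def find_mismatches(pinned: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (wanted, installed)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, wanted in pinned.items():
        try:
            # Drop a local build suffix such as "+cpu" before comparing.
            installed: Optional[str] = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

A mismatch is not necessarily a problem; the check only reports where an environment differs from the versions listed here.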
Algorithm | HalfCheetah-v4 |
---|---|
VPG | |
PPO | |
DDPG | |
TD3 | |
SAC | |