Actor-Critic with PPO

For an intuitive guide to the mechanics of actor-critic methods, check out the accompanying comic.

This notebook is designed for readability and exploration rather than production, and it uses a single GPU. For an industrial-strength PPO in PyTorch, check out ikostrikov's implementation. For the 'definitive' implementation of PPO, check out OpenAI Baselines (TensorFlow). For outstanding resources on RL, check out OpenAI's Spinning Up.