In Progress
Implementation of the Proximal Policy Optimization algorithm.
PPO is an on-policy method that aims to solve the step size issue with policy gradients. Policy gradient algorithms are typically very sensitive to step size: too large a step and the agent can fall into an unrecoverable state, too small a step and the agent takes a very long time to train. PPO addresses this by ensuring that the agent's policy never deviates too far from the previous policy.
A ratio is taken of the new policy to the old policy, and this ratio is clipped to ensure policy changes remain within a bound.
See the experiments folder for example implementations.
- waiting on bug gorgonia/gorgonia#373