MPO

PyTorch implementation of the Maximum a Posteriori Policy Optimisation (MPO) reinforcement learning algorithm (paper1, paper2) for OpenAI Gym environments.

How to Run

Tested in the following environment:

Windows 10
Python 3.7
PyTorch 1.8.1
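
To compare a local setup against the tested environment listed above, a small script along these lines may help. It is only a sketch: the `TESTED` dictionary and the `describe_environment` helper are hypothetical names introduced here and are not part of this repository.

```python
import platform
import sys

# Versions reported as tested in this README.
TESTED = {"os": "Windows 10", "python": (3, 7), "torch": "1.8.1"}


def describe_environment():
    """Collect the local OS, Python, and (if installed) PyTorch versions."""
    try:
        import torch
        torch_version = torch.__version__
    except ImportError:
        torch_version = None  # PyTorch is not installed yet
    return {
        "os": f"{platform.system()} {platform.release()}",
        "python": sys.version_info[:2],
        "torch": torch_version,
    }


if __name__ == "__main__":
    local = describe_environment()
    for key, tested in TESTED.items():
        print(f"{key}: local={local[key]!r} tested={tested!r}")
```

A mismatch does not necessarily mean the code will fail; it only shows where the local setup differs from the one that was tested.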

INSTALL

Install PyTorch by following the instructions at https://pytorch.org/, then install the remaining dependencies:

pip install gym Box2D IPython tqdm scipy tensorboard tensorboardx
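
After installing, it can be useful to confirm that the packages are importable before starting a training run. The sketch below uses only the standard library. The import names in `REQUIRED_MODULES` are assumptions, since an import name can differ from the pip package name, and `missing_modules` is a hypothetical helper rather than part of this repository.

```python
import importlib.util

# Assumed import names for the packages installed above, plus PyTorch.
# These may differ from the pip names (for example, tensorboardx is
# assumed to be imported as "tensorboardX").
REQUIRED_MODULES = [
    "torch", "gym", "Box2D", "IPython",
    "tqdm", "scipy", "tensorboard", "tensorboardX",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found locally."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```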

Continuous Action Space

python train.py \
  --device cuda:0 \
  --env LunarLanderContinuous-v2 \
  --log log_continuous \
  --render

Discrete Action Space

python train.py \
  --device cuda:0 \
  --env LunarLander-v2 \
  --log log_discrete \
  --render
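
The two commands above differ only in the environment name and the log directory, so they can also be assembled from Python. The following is a minimal sketch that uses only the flags shown above (`--device`, `--env`, `--log`, `--render`). `build_train_command` and `pick_device` are hypothetical helpers, and falling back to `cpu` assumes that `train.py` accepts any PyTorch device string, which this README does not confirm.

```python
import sys


def build_train_command(env, log_dir, device="cuda:0", render=True):
    """Build the argument list for train.py from the flags shown above."""
    cmd = [sys.executable, "train.py",
           "--device", device,
           "--env", env,
           "--log", log_dir]
    if render:
        cmd.append("--render")
    return cmd


def pick_device():
    """Return "cuda:0" if PyTorch reports a GPU, otherwise "cpu".

    Assumption: train.py accepts "cpu" as a --device value.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    device = pick_device()
    for env, log_dir in [("LunarLanderContinuous-v2", "log_continuous"),
                         ("LunarLander-v2", "log_discrete")]:
        # Print each command instead of running it.
        print(" ".join(build_train_command(env, log_dir, device=device)))
```

To actually start a run, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.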

License

This repository is a clone of theogruner/rl_pro_telu, which is licensed under the GNU GPLv3 license. See the LICENSE file for details.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MPO

How to Run

INSTALL

Continuous Action Space

Discrete Action Space

License

About

Releases

Packages

Languages

xander-2077/mpo

Folders and files

Latest commit

History

Repository files navigation

MPO

How to Run

INSTALL

Continuous Action Space

Discrete Action Space

License

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages