PyTorch implementation of D4PG

This repository contains a PyTorch implementation of D4PG that uses IQN as an improved distributional Critic in place of C51. The extensions Munchausen RL and D2RL are also included and can be combined with D4PG as needed.
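To illustrate what a quantile-based (IQN-style) Critic optimizes, here is a minimal sketch of the quantile Huber loss for a single TD error and a single quantile fraction. It is written in plain Python for readability; the function names and the default `kappa=1.0` are assumptions made for illustration, and the repository's actual loss code may differ.

```python
def huber(u, kappa=1.0):
    """Huber loss: quadratic near zero, linear beyond |u| > kappa."""
    a = abs(u)
    if a <= kappa:
        return 0.5 * u * u
    return kappa * (a - 0.5 * kappa)


def quantile_huber_loss(td_error, tau, kappa=1.0):
    """Quantile Huber loss for one TD error and one quantile fraction tau in (0, 1).

    td_error is assumed to be (target - prediction). Negative errors are
    weighted by (1 - tau) and non-negative errors by tau, which is what
    pushes the prediction toward the tau-quantile of the target distribution.
    """
    indicator = 1.0 if td_error < 0 else 0.0
    return abs(tau - indicator) * huber(td_error, kappa) / kappa
```

For a low quantile such as `tau = 0.25`, an over-estimate (negative TD error) is penalized three times as strongly as an under-estimate of the same size.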

Dependencies

Trained and tested on:

Python 3.6
PyTorch 1.4.0  
Numpy 1.15.2 
gym 0.10.11

How to use:

The new script combines all extensions; each add-on can be enabled by setting the corresponding flag.

python run.py -info your_run_info

Parameters: to see all available options, run python run.py -h

Observe training results

tensorboard --logdir=runs

Added Extensions:

Prioritized Experience Replay [X]
N-Step Bootstrapping [X]
D2RL [X]
Distributional IQN Critic [X]
Munchausen RL [X]
Parallel-Environments [X]
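As an illustration of the N-Step Bootstrapping entry above, the following sketch computes an n-step target from a short list of rewards plus a bootstrapped value estimate for the state reached after n steps. The function name, argument names, and the default `gamma` are hypothetical and are not taken from this repository.

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Return r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_value.

    rewards: the n rewards collected along the trajectory, oldest first.
    bootstrap_value: value estimate of the state reached after the n steps
                     (use 0.0 if the episode terminated).
    """
    g = bootstrap_value
    # Fold from the last reward backwards so each step applies one discount.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

With a single reward this reduces to the usual one-step target `r + gamma * bootstrap_value`.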

Results

Environment: Pendulum

Below you can see how IQN reduced the variance of the Critic loss:

Environment: LunarLander

Notes:

Performance depends heavily on good hyperparameters. For example, tau should be larger with PER (1e-2 on Pendulum) than with regular replay (1e-3).
BatchNorm had a clearly positive impact on overall performance (!)
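For context on the tau values in the notes above: in DDPG-style methods, tau is usually the interpolation factor of the soft (Polyak) target-network update. The sketch below shows that update on plain lists of floats, assuming this is the role tau plays here; the function and argument names are hypothetical, and a PyTorch version would loop over network parameters instead.

```python
def soft_update(local_params, target_params, tau):
    """Blend target parameters toward local ones: tau*local + (1 - tau)*target.

    A larger tau (e.g. 1e-2) makes the target network track the local
    network faster; a smaller tau (e.g. 1e-3) makes it change more slowly.
    """
    return [tau * l + (1.0 - tau) * t for l, t in zip(local_params, target_params)]
```

For example, `soft_update([1.0], [0.0], 1e-2)` moves the target parameter one percent of the way toward the local parameter.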

