A PyTorch implementation of Wolpertinger training with DDPG for a cache environment.
The code supports multi-GPU, single-GPU, and CPU-only training.
- python 3.6.8
- torch 1.1.0
- gym 0.14.0
- pyflann
- FLANN (Muja & Lowe, 2014) is a library of approximate nearest-neighbor methods that allows lookup complexity logarithmic in the number of actions. However, the Python binding of FLANN (pyflann) was written for Python 2 and is no longer maintained. Please refer to pyflann for a package compatible with Python 3; just download it and place it in your (virtual) environment.
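To illustrate what the nearest-neighbor lookup buys here, the sketch below shows the core of Wolpertinger action selection: the actor emits a continuous proto-action, the k nearest discrete actions are retrieved, and the critic picks the best of those candidates. Plain NumPy brute force stands in for FLANN only to keep the sketch self-contained; the function and argument names are illustrative, not this repo's API.

```python
import numpy as np

def wolpertinger_action(proto_action, actions, q_values_fn, k=5):
    """Sketch of Wolpertinger action selection (names are illustrative).

    proto_action: 1-D continuous action produced by the actor network.
    actions:      (N, d) array holding the embedding of every discrete action.
    q_values_fn:  callable scoring a (k, d) batch of candidate actions.
    In the actual code the k-NN step is done with FLANN for logarithmic
    lookup; brute-force distances are used here for self-containment.
    """
    dists = np.linalg.norm(actions - proto_action, axis=1)
    knn_idx = np.argpartition(dists, k)[:k]   # indices of the k nearest actions
    candidates = actions[knn_idx]
    # Refine the k-NN candidates with the critic: take the highest-Q action.
    return knn_idx[np.argmax(q_values_fn(candidates))]

# Toy usage: 10 scalar actions 0..9, proto-action 3.2, critic preferring 5.
actions = np.arange(10, dtype=float).reshape(-1, 1)
best = wolpertinger_action(np.array([3.2]), actions,
                           lambda c: -np.abs(c[:, 0] - 5.0), k=5)
```

The refinement step is what distinguishes Wolpertinger from a plain nearest-neighbor projection: among the 5 actions closest to 3.2 (1 through 5), the critic selects 5 rather than the single nearest action 3.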
- To use CPU only:
$ python main.py --gpu-ids -1
- To use single-GPU only:
$ python main.py --gpu-ids 0 --gpu-nums 1
- To use multi-GPU (e.g., use GPU-0 and GPU-1):
$ python main.py --gpu-ids 0 1 --gpu-nums 2
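The commands above suggest how the device flags could be parsed with argparse; a minimal sketch follows. The flag names match the commands, but the defaults and the `use_cuda` helper are assumptions, not the definitions in arg_parser.py.

```python
import argparse

# Sketch of the device flags implied by the commands above (the real
# definitions live in arg_parser.py; defaults here are assumptions).
parser = argparse.ArgumentParser()
parser.add_argument('--gpu-ids', type=int, nargs='+', default=[-1],
                    help='GPU ids to use; -1 means CPU only')
parser.add_argument('--gpu-nums', type=int, default=1,
                    help='number of GPUs to use')

# Example: the multi-GPU invocation from above.
args = parser.parse_args(['--gpu-ids', '0', '1', '--gpu-nums', '2'])
use_cuda = args.gpu_ids[0] >= 0   # -1 selects the CPU-only code path
```

`nargs='+'` is what lets a single flag accept either one id (`--gpu-ids 0`) or several (`--gpu-ids 0 1`).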
- You can set your experiment parameters in arg_parser.py
- train_test.py is used for the baseline experiment.
- train_test_window.py is used for the window experiment.
- Original paper of Wolpertinger training with DDPG: Deep Reinforcement Learning in Large Discrete Action Spaces (Dulac-Arnold et al., Google DeepMind).
- I used and modified parts of the code in https://github.com/ghliu/pytorch-ddpg, under the Apache License 2.0.
- I used and modified parts of the code in https://github.com/jimkon/Deep-Reinforcement-Learning-in-Large-Discrete-Action-Spaces, under the MIT License.