Worst-Case Soft Actor-Critic (WCSAC) for Safety-Constrained Reinforcement Learning: A PyTorch Implementation

This is a PyTorch implementation of Worst-Case Soft Actor-Critic (WCSAC) [Page] [PDF]. You can find the official TensorFlow implementation of WCSAC here.

This repository is built around PyTorch SAC by Denis Yarats and Ilya Kostrikov.

Results on `Safexp-PointGoal1-v0`

SAC-Lagrangian vs WCSAC-0.1

Requirements

We assume you have access to a GPU that can run CUDA 11.2. We also assume that you have Safety-Gym (GitHub, Blog) and Mujoco-Py 2.0.2.7 installed.

conda env create -f conda_env.yml
source activate pytorch_wcsac

Note: Please do not raise any issues for problems related to Safety-Gym or Mujoco-Py installation.
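Before training, it can help to confirm that the activated environment exposes the packages listed above. The following is a minimal sketch, not part of this repository: the helper `missing_packages` is hypothetical, and the import names `torch`, `safety_gym`, and `mujoco_py` are assumed to be the module names of PyTorch, Safety-Gym, and Mujoco-Py. It only looks the modules up and does not import them.

```python
import importlib.util

# Assumed import names for the requirements listed above.
REQUIRED = ["torch", "safety_gym", "mujoco_py"]


def missing_packages(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

An empty result only means the modules are discoverable. It does not verify the CUDA or MuJoCo setup itself.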

Instructions

To train a WCSAC-0.1 agent on the `Safexp-PointGoal1-v0` environment from Safety-Gym, run:

python train.py env=point_goal_1 risk_level=0.1 seed=123

This will produce an `exp` folder where all outputs are stored, including train/eval logs, TensorBoard blobs, and evaluation episode videos. To monitor training, attach TensorBoard by running:

tensorboard --logdir exp

See `config/agent/wcsac.yaml` for the full list of hyperparameters. Note that `env` should follow the format {Robot}_{Goal}_{Level-of-Difficulty}; for example, `point_goal_1` in the command above corresponds to `Safexp-PointGoal1-v0`.
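The naming convention can be illustrated with a short sketch. The helper `to_safety_gym_id` is hypothetical and not taken from this repository. It assumes the Safety-Gym ID pattern `Safexp-{Robot}{Task}{Level}-v0`, which matches the `point_goal_1` / `Safexp-PointGoal1-v0` pair used in the training command; how `train.py` actually resolves `env` may differ.

```python
def to_safety_gym_id(env: str) -> str:
    """Map a name such as 'point_goal_1' to a Safety-Gym ID (hypothetical helper)."""
    parts = env.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected robot_task_level, got {env!r}")
    robot, task, level = parts
    if not level.isdigit():
        raise ValueError(f"difficulty level must be a number, got {level!r}")
    # Assumed pattern: capitalized robot and task names followed by the level.
    return f"Safexp-{robot.capitalize()}{task.capitalize()}{level}-v0"


print(to_safety_gym_id("point_goal_1"))  # Safexp-PointGoal1-v0
```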

Acknowledgement

This repository was built as part of the final project of our master's deep learning laboratory course. We sincerely thank the authors of the WCSAC paper; their original implementation and their answers to the issues we raised played an important role in meeting our goals.

If you use this code in your research project, please cite us and the original authors as:

@misc{pytorch_wcsac,
  author = {Pfrang, Luca and Chandra, Akshay L and Koribille, Sri Harsha},
  title = {Worst-Case Soft Actor-Critic (WCSAC) for Safety-Constrained Reinforcement Learning: A PyTorch Implementation},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/acl21/wcsac}},
}

@article{Yang_Simão_Tindemans_Spaan_2021,
  author={Yang, Qisong and Simão, Thiago D. and Tindemans, Simon H and Spaan, Matthijs T. J.},
  title={WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning},
  year={2021},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence}, 
  volume={35},
  url={https://ojs.aaai.org/index.php/AAAI/article/view/17272}
}
