This code is a modified version of the code accompanying the paper *Imitation Learning with Stability and Safety Guarantees*. Like the original, it learns a neural network controller with stability and safety guarantees through imitation learning.
The original code for the inverted pendulum example was generalised to allow experimentation with an arbitrary number of hidden layers of arbitrary width. To this end, the scripts `NN_policy.py` and `solve_sdp.m` were modified. Some additional plotting scripts were also added for comparing the region of attraction (ROA) obtained with various neural network controllers.
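As a rough illustration of this generalisation, the sketch below builds a fully connected policy network from a single list of hidden-layer widths. It is a minimal sketch with hypothetical names (`build_policy`, `hidden_widths`), not the actual code in `NN_policy.py`:

```python
import tensorflow as tf

def build_policy(state_dim, hidden_widths, action_dim):
    """Fully connected policy whose depth and layer widths are set by
    the list `hidden_widths`, e.g. [32, 16] for two hidden layers."""
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(state_dim,)))
    for width in hidden_widths:
        # tanh is a common choice in this line of work, since the
        # SDP-based analysis relies on slope-restricted activations.
        model.add(tf.keras.layers.Dense(width, activation="tanh"))
    model.add(tf.keras.layers.Dense(action_dim))  # linear output layer
    return model

policy = build_policy(state_dim=2, hidden_widths=[32, 16], action_dim=1)
```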
- He Yin (he_yin at berkeley.edu)
- Peter Seiler (pseiler at umich.edu)
- Ming Jin (jinming at vt.edu)
- Murat Arcak (arcak at berkeley.edu)
- Carl R Richardson (cr2g16 at soton.ac.uk)
The code is written in Python 3 and MATLAB. Several packages are required:
- MOSEK: Commercial semidefinite programming solver
- CVX: MATLAB software for convex programming
- TensorFlow: Open-source machine learning platform
To plot the computed ROA, three more packages are required:
- SOSOPT: General sum-of-squares (SOS) optimization utility
- Multipoly: Package for representing multivariate polynomials
- MPT3: MATLAB-based Multi-Parametric Toolbox for parametric optimization, computational geometry, and model predictive control
- To start the safe imitation learning process, run `NN_policy.py`. The number of iterations, gradient descent steps, network size, and other parameters are defined in the main function (a hypothetical sketch of such a parameter block appears after this list).
- The computation results are stored in the folder `data`.
- To visualize the results, run `result_analysis.m`.
- To visualize a comparison between two different neural network controllers, run `result_analysis_ROA.m`.
- In both visualization scripts, the result directories, iteration numbers, and legend labels can be modified at the top of the file.
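Purely as an illustration of the kind of parameter block defined in the main function of `NN_policy.py`, here is a minimal sketch; every name and default value below is a placeholder, not the repository's actual code:

```python
def main():
    num_iterations = 10       # outer learning/verification iterations
    gd_steps = 500            # gradient descent steps per iteration
    hidden_widths = [32, 32]  # network size: one entry per hidden layer
    results_dir = "data"      # computation results are written here

    for it in range(num_iterations):
        # Imitation-learning updates for gd_steps steps, followed by
        # exporting the weights for the SDP verification in solve_sdp.m.
        pass

if __name__ == "__main__":
    main()
```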