
Modified the existing code for the pendulum example to allow for arbitrary choice of the width and depth of the neural network controller.


CR-Richardson/IQCbased_ImitationLearning

 
 


IQCbased_ImitationLearning

This repository is a modified version of the code written to accompany the paper Imitation Learning with Stability and Safety Guarantees. Like the original, it learns a neural network controller with stability and safety guarantees through imitation learning.

Modification

The original code for the inverted pendulum example was generalised to allow experimentation with an arbitrary number of hidden layers of arbitrary width. To this end, the scripts NN_policy.py and solve_sdp.m were modified. Additional plotting scripts were also added for comparing the region of attraction (ROA) obtained with different neural network controllers.
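To illustrate what "arbitrary depth and width" means for the controller, here is a minimal sketch in plain Python. It is not the code in NN_policy.py: the function names `init_mlp` and `policy`, the per-layer list `hidden_widths`, the tanh activation, and the zero-bias initialisation are all assumptions made for this example.

```python
import math
import random


def init_mlp(n_in, hidden_widths, n_out, seed=0):
    """Create (weights, biases) for each layer of a fully connected network.

    hidden_widths is a list such as [16, 16, 16]: its length sets the depth
    and each entry sets the width of one hidden layer (hypothetical interface).
    """
    rng = random.Random(seed)
    sizes = [n_in, *hidden_widths, n_out]
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(fan_in)
        W = [[rng.gauss(0.0, scale) for _ in range(fan_in)] for _ in range(fan_out)]
        b = [0.0] * fan_out  # zero biases, so the origin maps to zero output
        params.append((W, b))
    return params


def affine(W, b, h):
    """Compute W h + b for a list-of-rows matrix W."""
    return [sum(w * v for w, v in zip(row, h)) + bias for row, bias in zip(W, b)]


def policy(params, x):
    """Evaluate the controller: tanh on hidden layers, linear output layer."""
    h = [float(v) for v in x]
    for W, b in params[:-1]:
        h = [math.tanh(z) for z in affine(W, b, h)]
    W_out, b_out = params[-1]
    return affine(W_out, b_out, h)


# Two states in (e.g. angle and angular velocity), one control input out.
params = init_mlp(n_in=2, hidden_widths=[16, 16, 16], n_out=1)
u = policy(params, [0.1, -0.2])
print(len(params), len(u))
```

Changing the length of `hidden_widths` changes the depth, and changing its entries changes the widths, without touching the evaluation loop. With zero biases the sketch returns zero control at the origin.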

Original Authors:

  • He Yin (he_yin at berkeley.edu)
  • Peter Seiler (pseiler at umich.edu)
  • Ming Jin (jinming at vt.edu)
  • Murat Arcak (arcak at berkeley.edu)

Author of modified code:

  • Carl R Richardson (cr2g16 at soton.ac.uk)

Getting Started

The code is written in Python 3 and MATLAB.

Prerequisites

There are several packages required:

  • MOSEK: Commercial semidefinite programming solver
  • CVX: MATLAB Software for Convex Programming
  • TensorFlow: Open source machine learning platform

To plot the computed ROA, three more packages are required:

  • SOSOPT: General SOS optimization utility
  • Multipoly: Package used to represent multivariate polynomials
  • MPT3: MATLAB-based Multi-Parametric Toolbox for parametric optimization, computational geometry and model predictive control.
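Before running the Python script, it can help to confirm that the Python-side dependency is installed. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical, and the check covers TensorFlow only, since the other packages listed here are used from MATLAB.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["tensorflow"])
print("Missing Python packages:", missing or "none")
```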

Using the Code

  • To start the safe imitation learning process, run NN_policy.py. The number of iterations, gradient descent steps, network size, and other parameters are defined in the main function.
  • The computation results are stored in the folder data.
  • To visualize the results, run result_analysis.m.
  • To visualize a comparison between two different neural network controllers, run result_analysis_ROA.m.
  • In both visualisation scripts, the results directories, iteration numbers, and legend labels can be modified at the top of the script.
