Anomaly Detection in OT datasets through machine learning (AE/VAE/PCA)

Aegrah/AE-VAE-PCA-Anomaly_Detection-ML

AE, VAE and PCA anomaly detection in operational technology infrastructure datasets

This repository contains the Jupyter Notebooks and Python code that I used while writing my master's thesis, "Anomaly Detection in Operational Technology Infrastructures Using Artificial Neural Networks". The models are implemented using TensorFlow and Keras. The thesis examined the following models:

  • Auto-Encoders (AE)
  • Variational Auto-Encoders (VAE)
  • Principal Component Analysis (PCA)

The thesis concluded that the original Auto-Encoder model performed best at anomaly detection in operational technology infrastructure datasets.
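For readers new to the approach, a common way to use models like these for anomaly detection is to fit them on normal data only and flag samples whose reconstruction error is unusually high. The following is a minimal sketch of that idea using PCA with NumPy, not the thesis code itself. The synthetic data, the helper names `fit_pca` and `reconstruction_error`, the choice of two components, and the 99th-percentile threshold are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Fit PCA on normal-only training data; return the mean and principal axes."""
    mean = X_train.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_components]

def reconstruction_error(X, mean, components):
    """Per-sample mean squared error between X and its low-rank reconstruction."""
    Z = (X - mean) @ components.T      # project into the latent space
    X_hat = Z @ components + mean      # map back to the input space
    return np.mean((X - X_hat) ** 2, axis=1)

rng = np.random.default_rng(0)

# Hypothetical "normal" data: 5 sensors driven by 2 hidden factors plus small noise.
mixing = rng.normal(size=(2, 5))
normal = rng.normal(size=(1000, 2)) @ mixing + 0.05 * rng.normal(size=(1000, 5))

# Hypothetical "attack" data: readings that ignore the correlation between sensors.
attack = 3.0 * rng.normal(size=(20, 5))

mean, components = fit_pca(normal, n_components=2)
train_err = reconstruction_error(normal, mean, components)

# Assumed threshold: the 99th percentile of the training reconstruction error.
threshold = np.percentile(train_err, 99)

is_anomaly = reconstruction_error(attack, mean, components) > threshold
print(f"flagged {is_anomaly.sum()} of {len(attack)} attack samples")
```

The same scoring scheme carries over to an Auto-Encoder by replacing the projection and back-projection with the network's encoder and decoder. How the threshold is chosen has a large effect on false positives, so it is worth tuning per dataset.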

The following datasets were used to evaluate the models during this research:

  • Water Distribution (WADI)
  • Secure Water Treatment (SWaT)
  • Battle of Attack Detection Algorithms (BATADAL)

These datasets can be requested from iTrust, Centre for Research in Cyber Security, at the following URL: https://itrust.sutd.edu.sg/itrust-labs_datasets/dataset_info/

The credit card fraud dataset was used for initial testing, to check that the different models worked correctly. This dataset is available at: https://www.kaggle.com/mlg-ulb/creditcardfraud

Although this repository contains almost all of the code used for my research, I did not test all of it before pushing it to GitHub. I wrote my thesis two years ago and do not currently have an up-to-date machine learning development environment. Most of the code should still work and can serve as a starting point for developing your own models.
