Overview

Mateen is an ensemble framework designed to enhance AutoEncoder (AE)-based one-class network intrusion detection systems by effectively managing distribution shifts in network traffic. It comprises four key components:

Shift Detection Function

Purpose: Detects distribution shifts in network traffic using statistical methods.

Sample Selection

Subset Selection: Identifies a representative subset of the network traffic samples that reflects the overall distribution after a shift.
Labeling and Update Decision: The subset is manually labeled to decide whether an update to the ensemble is necessary.

Shift Adaptation Module

Incremental Model Update: Integrates the benign data of the labeled subset with the existing training set. Then, updates the incremental model on this expanded set.
Temporary Model Training: Initiates a new temporary model with the same weights as the incremental model. Then, train this model exclusively on the benign data of the labeled subset.

Complexity Reduction Module

Model Merging: Merges temporary models that perform similarly.
Model Pruning: Removes models that underperform compared to the best-performing model.

For further details, please refer to the main paper.

Pre-requisites and requirements

Ensure the following dependencies are installed before running Mateen. You can install them using the command below:

pip install -r requirements.txt

Contents of 'requirements.txt':

torch==2.0.1
numpy==1.25.0
pandas==1.5.3
scipy==1.10.1
sklearn==1.2.2
tqdm==4.65.0

Models and Data

You can download the pre-trained models, the processed data, as well as the results CSV files from the following link:

Google Drive Folder

The contents of the folder are as follows:

Datasets.zip: Contains the processed data.
Models.zip: Contains the pre-trained models.
Results.zip: Prediction results and probabilities across datasets.

Ensure these files are placed in the Mateen/ directory after downloading and extracting.

How to Use Mateen

To utilize Mateen with our settings, please follow these steps to set up the required datasets and run the framework.

Dataset Setup

First, download the datasets as mentioned in the Models and Data section. Ensure that the files are organized in the following directories:

Datasets/CICIDS2017/ for IDS2017
Datasets/IDS2018/ for IDS2018
Datasets/Kitsune/ for Kitsune and its variants.

You can directly download and unzip the datasets into the main directory of Mateen (i.e., Mateen/).

Running Mateen

To run Mateen, use the following command:

python Mateen.py

Command-Line Options

You can customize the execution using various command-line options:

Dataset Selection

Switch between datasets using the '--dataset_name' option.

Example:

python Mateen.py --dataset_name "IDS2017"

Options

"IDS2017", "IDS2018", "Kitsune", "mKitsune", and "rKitsune"

Window Size

Set the window size using the '--window_size' option.

Example:

python Mateen.py --dataset_name "IDS2017" --window_size 50000

Options

10000, 50000, and 100000

Shift Detection Threshold

Set the threshold using '--shift_threshold' option.

Example:

python Mateen.py  --dataset_name "IDS2017" --window_size 50000 --shift_threshold 0.05

Options

0.05, 0.1, and 0.2

Performance Threshold

The minimum acceptable performance '--performance_thres' option.

Example:

python Mateen.py --dataset_name "IDS2017" --window_size 50000 --shift_threshold 0.05 --performance_thres 0.99

Options

0.99, 0.95, 0.90, 0.85, and 0.8

Maximum Ensemble Size

The maximum acceptable ensemble size '--max_ensemble_length' option.

Example:

python Mateen.py --dataset_name "IDS2017" --window_size 50000 --shift_threshold 0.05 --performance_thres 0.99 --max_ensemble_length 3

Options

3, 5, and 7

Selection Rate

Set the selection rate for building a subset for manual labeling using the '--selection_budget' option.

Example:

python Mateen.py  --dataset_name "IDS2017" --window_size 50000 --shift_threshold 0.05 --performance_thres 0.99 --max_ensemble_length 3 --selection_budget 0.01

Options

0.005, 0.01, 0.05, and 0.1

Mini Batch Size for Sample Selection

Choose the min-batch size using the '--mini_batch_size' option.

Example:

python Mateen.py --dataset_name "IDS2017" --window_size 50000 --shift_threshold 0.05 --performance_thres 0.99 --max_ensemble_length 3 --selection_budget 0.01 --mini_batch_size 1000

Options

500, 1000, and 1500

Retention Rate

Set the value of the retention rate using '--retention_rate' option.

Example:

python Mateen.py --dataset_name "IDS2017" --window_size 50000 --shift_threshold 0.05 --performance_thres 0.99 --max_ensemble_length 3 --selection_budget 0.01 --mini_batch_size 1000 --retention_rate 0.3

Options

0.3, 0.5, and 0.9

Lambda 0 value

Adjust the lambda_0 parameter with the '--lambda_0' option to adjust the weight assigned to uniqueness scores during the sample selection process.

Example:

python Mateen.py  --dataset_name "IDS2017" --window_size 50000 --shift_threshold 0.05 --performance_thres 0.99 --max_ensemble_length 3 --selection_budget 0.01 --mini_batch_size 1000 --retention_rate 0.3 --lambda_0 0.1

Options

0.1, 0.5, and 1.0

Hyperparameter Selection

For further details about the hyperparameter selection, please refer to the main paper, Appendix C.

Citation

@inproceedings{alotaibi24mateen,
  title={Mateen: Adaptive Ensemble Learning for Network Anomaly Detection},
  author={Alotaibi, Fahad and Maffeis, Sergio},
  booktitle={the 27th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2024)},
  year={2024},
  organization={Association for Computing Machinery}
}

Contact

If you have any questions or need further assistance, please feel free to reach out to me at any time:

Email: [email protected]
Alternate Email: [email protected]

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
MateenUtils		MateenUtils
Mateen.py		Mateen.py
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Overview

Shift Detection Function

Sample Selection

Shift Adaptation Module

Complexity Reduction Module

Pre-requisites and requirements

Models and Data

How to Use Mateen

Dataset Setup

Running Mateen

Command-Line Options

Dataset Selection

Window Size

Shift Detection Threshold

Performance Threshold

Maximum Ensemble Size

Selection Rate

Mini Batch Size for Sample Selection

Retention Rate

Lambda 0 value

Hyperparameter Selection

Citation

Contact

About

Releases

Packages

Languages

ICL-ml4csec/Mateen

Folders and files

Latest commit

History

Repository files navigation

Overview

Shift Detection Function

Sample Selection

Shift Adaptation Module

Complexity Reduction Module

Pre-requisites and requirements

Models and Data

How to Use Mateen

Dataset Setup

Running Mateen

Command-Line Options

Dataset Selection

Window Size

Shift Detection Threshold

Performance Threshold

Maximum Ensemble Size

Selection Rate

Mini Batch Size for Sample Selection

Retention Rate

Lambda 0 value

Hyperparameter Selection

Citation

Contact

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages