AI-based Intrusion Detection System for the Cybersecurity AI Big Data Challenge

cv-lee/IDS-BERT

AI Based Intrusion Detection System (IDS)

  • Implementation of an NLP-based Intrusion Detection System (IDS) for binary classification of detected attack packets.
  • This project won 1st place (Minister of Science and ICT Award) in the Cybersecurity AI Big Data Challenge (Nov 2022).

📋 Task

The task is binary classification: label each intrusion detection system (IDS) detection result as either an attack packet or a non-attack packet.

🤖 Model

  • Base Model: RoBERTa (SecureBERT)
    • Fine-tuned on IDS-related binary classification data.
    • Leverages pre-trained language model capabilities for analyzing attack packet data.
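
A fine-tuned sequence classifier of this kind typically emits two logits per input, one per class. The sketch below shows how such logits could be turned into a label. The label order (index 0 = non-attack, index 1 = attack), the `threshold` parameter, and the names `softmax` and `predict_label` are assumptions for illustration and are not taken from this repository.

```python
import math
from typing import List, Sequence, Tuple

# Assumed label order; the repository may encode the classes differently.
LABELS = ("non-attack", "attack")


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits: Sequence[float], threshold: float = 0.5) -> Tuple[str, float]:
    """Return (label, attack probability) for a two-logit model output."""
    if len(logits) != 2:
        raise ValueError("expected exactly two logits")
    attack_prob = softmax(logits)[1]
    label = LABELS[1] if attack_prob >= threshold else LABELS[0]
    return label, attack_prob
```

Raising `threshold` above 0.5 would trade missed attacks for fewer false alarms; the right value depends on how the challenge scores predictions, which this README does not state.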

📊 Dataset

  • Intrusion Detection System Dataset
    • Contains labeled samples for binary classification.
    • Size: N million samples.
    • Includes features extracted from network traffic packets:
    'PAYLOAD', 'APP_PROTO', 'SRC_PORT', 'DST_PORT', 'IMPACT', 'RISK', 'JUDGEMENT', 'Method', 'Method-URL', 'HTTP', 'Host', 'User-Agent', 'Accept', 'Accept-Encoding', 'Accept-Language', 'Accept-Charset', 'Content-Type', 'Content-Length', 'Connection', 'Cookie', 'Upgrade-Insecure-Requests', 'Pragma', 'Cache-Control', 'Body'
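
A language model consumes text, so the tabular fields above need to be flattened into a single string before tokenization. The following is a minimal sketch of one way to do that. The `name: value` layout, the ` | ` separator, the chosen field subset, and the function name `row_to_text` are assumptions; `data_preprocess.py` may do this differently.

```python
from typing import Mapping, Optional, Sequence

# A subset of the column names listed above, chosen only for illustration.
TEXT_FIELDS = (
    "APP_PROTO", "SRC_PORT", "DST_PORT", "Method", "Host", "User-Agent", "PAYLOAD",
)


def row_to_text(
    row: Mapping[str, Optional[object]],
    fields: Sequence[str] = TEXT_FIELDS,
) -> str:
    """Join the non-empty fields of one record into a single input string."""
    parts = []
    for name in fields:
        value = row.get(name)
        if value is None:
            continue  # skip headers that were absent from the packet
        # Collapse runs of whitespace (including CR/LF) to single spaces.
        text = " ".join(str(value).split())
        if text:
            parts.append(f"{name}: {text}")
    return " | ".join(parts)
```

Keeping the field name next to each value lets the model distinguish, for example, a source port from a destination port, at the cost of a few extra tokens per record.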

📂 Repository Structure

IDS-BERT/
├── ckpt/                 
│   ├── pretrained/              
│   └── trained/                 
├── dataset/                    
├── data_preprocess.py        
├── train.py    
├── inference.py          
├── utils.py                 
├── main.ipynb                       
└── README.md                   
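
Before running the scripts, it can help to confirm that the layout above is in place. The sketch below reports which expected paths are missing; the helper name `missing_paths` and the `EXPECTED_PATHS` constant are hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List

# Paths taken from the tree above, relative to the repository root.
EXPECTED_PATHS = (
    "ckpt/pretrained",
    "ckpt/trained",
    "dataset",
    "data_preprocess.py",
    "train.py",
    "inference.py",
    "utils.py",
)


def missing_paths(repo_root: str) -> List[str]:
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]
```

Calling `missing_paths(".")` from the repository root returns an empty list when every listed path exists.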

📚 Requirements

  • Python: 3.9+
  • CUDA: 11.7+ (for GPU-based training and inference)
  • For a complete list of dependencies, see requirements.txt.

🚀 Getting Started

1. Clone the Repository

git clone https://github.com/cv-lee/IDS-BERT.git

2. Prepare the Dataset & Pretrained Model

  • Place your dataset files in the dataset/ folder.
  • Place the pretrained model files in the ckpt/pretrained/ folder.
  • Open the main.ipynb file.
  • Execute the data preprocessing step:
    python3 data_preprocess.py

3. Train the Model

  • Train the RoBERTa model using the preprocessed dataset and pretrained model:
    python3 train.py

4. Run Inference

  • Use the trained model for binary classification:
    python3 inference.py

📄 Configuration

{
  "max_seq_length": 512,
  "batch_size": 16,
  "learning_rate": 1e-5,
  "weight_decay": 0.01,
  "num_epochs": 10,
  "device": "cuda"
}
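
The README does not say where these settings are stored or how the scripts read them. As a sketch under that caveat, the values could be parsed from JSON text into a typed object and sanity-checked before training; the names `TrainConfig` and `load_config` are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Defaults mirror the values shown above.
    max_seq_length: int = 512
    batch_size: int = 16
    learning_rate: float = 1e-5
    weight_decay: float = 0.01
    num_epochs: int = 10
    device: str = "cuda"


def load_config(text: str) -> TrainConfig:
    """Parse JSON text into a TrainConfig and reject obviously invalid values."""
    raw = json.loads(text)
    config = TrainConfig(**raw)  # raises TypeError on unknown keys
    if min(config.max_seq_length, config.batch_size, config.num_epochs) <= 0:
        raise ValueError("max_seq_length, batch_size and num_epochs must be positive")
    if config.learning_rate <= 0:
        raise ValueError("learning_rate must be positive")
    if config.weight_decay < 0:
        raise ValueError("weight_decay must not be negative")
    return config
```

Failing fast on a bad value is cheaper than discovering it partway through a multi-epoch training run.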

📬 Contact

For any questions or issues, please contact:
