A Novel Solution for Zero-Day Attack Detection in IDS using Self-Attention and Jensen-Shannon Divergence in WGAN-GP

Ziyu Mu; Xiyu Shi; Safak Dogan

Abstract

The increasing sophistication of cyber threats, especially zero-day attacks, poses a significant challenge to cybersecurity. Zero-day attacks exploit unknown vulnerabilities, making them difficult to detect and defend against. Existing approaches rely on patching flaws and deploying an Intrusion Detection System (IDS). This paper proposes using advanced Wasserstein GANs with Gradient Penalty (WGAN-GP) to synthesize network traffic that mimics zero-day patterns, enriching data diversity and improving IDS generalization. First, SA-WGAN-GP is introduced, which adds a Self-Attention (SA) mechanism that captures long-range cross-feature dependencies by reshaping the feature vector into tokens after dense projections. Second, JS-WGAN-GP is proposed, which adds a Jensen-Shannon (JS) divergence-based auxiliary discriminator that is trained with Binary Cross-Entropy (BCE), frozen during updates, and used to regularize the generator for smoother gradients and higher sample quality. Third, SA-JS-WGAN-GP is created by combining the SA mechanism with JS divergence, thereby enhancing the data generation ability of WGAN-GP. Because data augmentation does not equate to true zero-day attack discovery, we emulate zero-day attacks with the leave-one-attack-type-out method on the NSL-KDD dataset, which is used to train all GANs and IDS models when assessing the effectiveness of the proposed solution. The evaluation results show that integrating SA and JS divergence into WGAN-GP yields superior IDS performance and more effective zero-day risk detection.
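The leave-one-attack-type-out emulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each record carries a coarse NSL-KDD category label (Normal, DoS, Probe, R2L, U2R) and that one attack category at a time is withheld from training so it plays the role of the unseen "zero-day" class. The function name, the `label` field, and the toy records are hypothetical and are not taken from the paper, which may hold out attacks at a different granularity.

```python
def leave_one_attack_type_out(records, held_out, label_key="label"):
    """Split records so the held-out attack type never appears in training.

    Hypothetical helper: `records` is a list of dicts, `held_out` is the
    attack category treated as the emulated zero-day class.
    """
    if held_out == "Normal":
        raise ValueError("only attack categories can be held out")
    train = [r for r in records if r[label_key] != held_out]
    unseen = [r for r in records if r[label_key] == held_out]
    return train, unseen


# Toy records standing in for NSL-KDD rows (values are illustrative only).
records = [
    {"duration": 0, "label": "Normal"},
    {"duration": 2, "label": "DoS"},
    {"duration": 1, "label": "Probe"},
    {"duration": 5, "label": "R2L"},
    {"duration": 9, "label": "U2R"},
    {"duration": 3, "label": "DoS"},
]

# One split per attack category; each split emulates a different zero-day.
attack_types = ["DoS", "Probe", "R2L", "U2R"]
splits = {a: leave_one_attack_type_out(records, a) for a in attack_types}

for attack, (train, unseen) in splits.items():
    print(attack, "train:", len(train), "held out:", len(unseen))
```

In this setup the GANs and IDS models would only ever see `train`, while `unseen` is reserved for testing whether the IDS flags an attack category it was never trained on.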

Paper Structure (20 sections, 25 equations, 5 figures, 10 tables, 1 algorithm)

Introduction
Related Work
Proposed Methodology
From WGAN to WGAN-GP
The SA-WGAN-GP Model
The JS-WGAN-GP Model
The SA-JS-WGAN-GP Model
Experimental Setup
Dataset For the Proposed WGAN-GP Models
IDS Models
Synthesizing Dataset for IDS
Results And Discussion
Evaluation Metrics
Binary Classification
Multi-Classification
...and 5 more sections

Figures (5)

Figure 1: The flowchart (left) shows the workflow of the enhanced performance evaluation process for IDS models, while the right side illustrates each step involved.
Figure 2: The structure of the proposed SA-JS-WGAN-GP. Loss_G, Loss_D, and Loss_C represent the loss values of G, D, and C, respectively.
Figure 3: Distribution of data types of the training and testing dataset in the NSL-KDD dataset. (a) The original training dataset contains a significant proportion of Normal (53.46%) and DOS (36.46%) data, while other attack types, like R2L (0.8%) and U2R (0.04%), are rare. (b) Synthetic data has been generated and added to the binary classification training dataset, balancing the attack types. U2R now has a higher proportion (4.18%) compared to the original dataset. (c) The original NSL-KDD test dataset has proportions similar to the original training set. (d) Synthetic data has been generated and added to the multi-class training set, showing a more balanced distribution of attack types, likely helping the model better recognize each category.
Figure 4: Binary classification average accuracy of various IDS models with different WGAN-GP variants. (a) Average accuracy of IDS models across different WGAN-GP variants. (b) Average accuracy of WGAN-GP variants across different IDS models.
Figure 5: Multi-classification average accuracy of various IDS models with different WGAN-GP variants. (a) Average accuracy of IDS models across different WGAN-GP variants. (b) Average accuracy of WGAN-GP variants across different IDS models.
