A Comparative Analysis of DNN-based White-Box Explainable AI Methods in Network Security

Osvaldo Arreche; Mustafa Abdallah

TL;DR

This paper tackles the interpretability gap in neural-network-based network intrusion detection with a white-box XAI framework that uses LRP, Integrated Gradients, and DeepLift to generate explanations. It evaluates these methods with six metrics (Descriptive Accuracy, Sparsity, Stability, Robustness, Efficiency, and Completeness) across three datasets: NSL-KDD, CICIDS-2017, and RoEduNet-SIMARGL2021. The results show that white-box approaches generally yield robust and complete explanations and often outperform black-box baselines. The authors provide an end-to-end pipeline, detailed metric algorithms, and open-source code to enable reproducibility and community extension. The work highlights practical considerations for deploying XAI in real-time IDS, suggests directions for improving robustness and efficiency, and offers a benchmark and methodology for future research in explainable security analytics.

Abstract

Recent research has focused on artificial intelligence (AI) solutions for network intrusion detection systems (NIDS), motivated by the ever-growing number of intrusions on networked systems. These solutions, however, keep growing in complexity and are increasingly hard to interpret. The use of explainable AI (XAI) techniques in real-world intrusion detection systems therefore stems from the need to explain black-box AI models to security analysts. To meet this need, this paper applies and evaluates white-box XAI techniques (specifically LRP, IG, and DeepLift) for NIDS via an end-to-end framework for neural network models. The evaluation uses three widely used network intrusion datasets (NSL-KDD, CICIDS-2017, and RoEduNet-SIMARGL2021), covers both global and local explanation scopes, and examines six distinct assessment measures (descriptive accuracy, sparsity, stability, robustness, efficiency, and completeness). We also compare the performance of white-box XAI methods with that of black-box XAI methods. The results show that white-box XAI techniques score high in robustness and completeness, which are crucial metrics for IDS. Moreover, the source code of our XAI evaluation framework is available for the research community to use and improve.
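To make the idea of a white-box attribution method concrete, the following is a minimal sketch of Integrated Gradients (IG), one of the three methods named above. It is not the authors' implementation: the toy logistic model, the weights `w` and `b`, the all-zeros baseline, and the step count are all assumptions chosen for illustration. IG is "white-box" in the sense that it needs the model's gradients with respect to its inputs, which the sketch computes analytically.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w, b):
    """Toy differentiable classifier (assumption): logistic regression score."""
    return sigmoid(x @ w + b)

def model_grad(x, w, b):
    """Analytic gradient of the score with respect to the input x."""
    p = model(x, w, b)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients.

    Averages the input gradient along the straight path from the
    baseline to x, then scales by (x - baseline) per feature.
    """
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)  # shape (steps, n_features)
    grads = np.stack([model_grad(point, w, b) for point in path])
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical weights and input; the third feature has zero weight.
w = np.array([1.5, -2.0, 0.0, 0.7])
b = -0.1
x = np.array([1.0, 0.5, 3.0, -1.0])
baseline = np.zeros_like(x)  # assumption: all-zeros reference input

attr = integrated_gradients(x, baseline, w, b)
score_gap = model(x, w, b) - model(baseline, w, b)
```

A useful sanity check is that the attributions sum (approximately) to the difference between the model's score at `x` and at the baseline, and that a feature the model ignores receives zero attribution. This sum property belongs to IG itself and should not be confused with the Completeness metric evaluated in the paper, which this summary does not define.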

Paper Structure (34 sections, 8 figures, 14 tables)

This paper contains 34 sections, 8 figures, 14 tables.

Introduction
Related Work
The Problem Statement
Network Intrusion Types
Intrusion Detection Systems
Black-box AI Models and its caveats
Main Categories of Explainable AI (XAI)
Benefits of XAI for Network IDS
Challenges of XAI for IDS and Need for Evaluating XAI
Framework
Overview of the XAI Evaluation Framework
In-Depth XAI Evaluation Pipeline Components
Step-by-step Algorithms for Generating XAI Evaluation Metrics
Top Intrusion Features List and Usage in Evaluating XAI
Application of Features for White-box XAI
...and 19 more sections

Figures (8)

Figure 1: A diagram of the XAI framework for evaluation of network intrusion detection. It considers six evaluation metrics, three white-box XAI methods, a neural network AI model, and three invaluable intrusion datasets.
Figure 2: The Descriptive Accuracy experiment using the DeepLift, IG, and LRP white-box XAI methods. The graph shows accuracy declining as the most important intrusion features are removed (x-axis), demonstrating the methods' effectiveness at global explainability on the three datasets.
Figure 3: Sparsity plots for LRP, IG, and DeepLift on the three datasets. The methods perform comparably across datasets; on CICIDS-2017, however, IG and LRP perform best.
Figure 4: An illustration of a DoS instance from the CICIDS-2017 dataset, considering the Robustness experiment using DeepLift. In (a), the feature list (with flow duration as the top feature) under a biased explanation is displayed. In (b), the list (with the engineered feature as the top feature) after the adversarial model's classification is exhibited.
Figure 5: The percentage of data samples for which biased and unrelated features appear in top-3 features (according to DeepLift rankings of feature importance) for the biased classifier (in (a)) and adversarial classifier (in (b), (c) and (d)) that uses one uncorrelated feature for each dataset. Note that (c) displays the best result. It barely suffers the influence of the unrelated column while displaying the Biased Feature in the third position.
...and 3 more figures
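The Figure 2 caption describes accuracy declining as the top-ranked features are removed. A minimal sketch of that kind of experiment follows. It is an illustration under stated assumptions, not the paper's algorithm: "removing" a feature is modeled here as setting it to zero, the classifier is a hypothetical linear rule, and the feature ranking comes from the absolute weights as a stand-in for a ranking produced by LRP, IG, or DeepLift.

```python
import numpy as np

def descriptive_accuracy(predict, X, y, ranking, ks):
    """Accuracy after zeroing the top-k ranked features, for each k in ks."""
    scores = []
    for k in ks:
        X_masked = X.copy()
        X_masked[:, ranking[:k]] = 0.0  # assumption: removal means zeroing
        scores.append(float(np.mean(predict(X_masked) == y)))
    return scores

# Synthetic data (assumption): only the first two features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
w = np.array([3.0, 2.0, 0.0, 0.0, 0.0])
y = (X @ w > 0).astype(int)

def predict(X_in):
    """Hypothetical classifier that matches the labeling rule exactly."""
    return (X_in @ w > 0).astype(int)

# Stand-in for an attribution-based ranking: most important feature first.
ranking = np.argsort(-np.abs(w))

curve = descriptive_accuracy(predict, X, y, ranking, ks=[0, 1, 2])
```

If the ranking reflects what the model actually relies on, the curve should fall quickly as `k` grows, which is the behavior the caption attributes to an effective explanation method.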
