Integrating Explainable AI for Effective Malware Detection in Encrypted Network Traffic

Sileshi Nibret Zeleke; Amsalu Fentie Jember; Mario Bochicchio

TL;DR

This work tackles malware detection in encrypted network traffic without decrypting payloads, combining explainable AI with ensemble tree models and multi-view feature extraction. A multi-source dataset of 1,127 malware connections spanning 54 families, together with normal traffic, is used to train and evaluate the models; XGBoost performs best, scoring over 99% on accuracy, precision, and F1-score. SHAP-based explanations give both global and local insight, identifying maximum packet size, mean packet inter-arrival time, and TLS version as the main drivers of the model's decisions, which improves transparency and trust. The approach shows strong detection capability in encrypted environments and emphasizes interpretability to support cybersecurity analysts and policy automation.
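As a rough illustration of the payload-independent statistics named above, the sketch below computes the maximum packet size and mean inter-arrival time for a single flow. The `Packet` record and the `flow_features` helper are hypothetical names chosen for this example, not the authors' extraction code, and the TLS version is assumed to have been parsed from the handshake already.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Packet:
    """Hypothetical per-packet record; no payload bytes are needed."""
    timestamp: float  # capture time in seconds
    size: int         # packet length in bytes


def flow_features(packets, tls_version):
    """Summarize one flow with three of the features highlighted by SHAP."""
    if not packets:
        raise ValueError("flow has no packets")
    # Sort by capture time so that gaps between neighbours are non-negative.
    ordered = sorted(packets, key=lambda p: p.timestamp)
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    return {
        "max_packet_size": max(p.size for p in ordered),
        # A single-packet flow has no gaps; 0.0 is an assumed convention.
        "mean_inter_arrival_time": mean(gaps) if gaps else 0.0,
        "tls_version": tls_version,
    }


# Made-up packets, for illustration only.
example_flow = [Packet(0.00, 517), Packet(0.25, 1460), Packet(0.75, 93)]
features = flow_features(example_flow, tls_version="TLSv1.2")
```

Because only sizes, timings, and handshake metadata are read, a summary of this kind can be built without decrypting any payload.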

Abstract

Encrypted network communication ensures confidentiality, integrity, and privacy between endpoints. However, attackers increasingly exploit encryption to conceal malicious behavior. Detecting unknown encrypted malicious traffic without decrypting the payloads remains a significant challenge. In this study, we investigate the integration of explainable artificial intelligence (XAI) techniques to detect malicious network traffic. We employ ensemble learning models to identify malicious activity using multi-view features extracted from various aspects of encrypted communication. To represent malicious communication effectively, we compiled a dataset with 1,127 unique connections, more than any other available open-source dataset, spanning 54 malware families. Our models were benchmarked on the CTU-13 dataset, achieving over 99% accuracy, precision, and F1-score. Additionally, the eXtreme Gradient Boosting (XGB) model achieved 99.32% accuracy, 99.53% precision, and 99.43% F1-score on our custom dataset. By leveraging Shapley Additive Explanations (SHAP), we identified the maximum packet size, the mean inter-arrival time of packets, and the Transport Layer Security (TLS) version as the most critical features for the global model explanation. We also identified the features that matter most in local explanations of individual traffic samples from both datasets. These insights provide a deeper understanding of the model's decision-making process, enhancing the transparency and reliability of malicious encrypted traffic detection.
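To make the idea behind a local explanation concrete, the sketch below computes exact Shapley values for a three-feature toy scoring function. Everything in it is an assumption made for illustration: `toy_score` is a hand-written stand-in rather than the paper's XGBoost model, the thresholds and sample values are invented, and the brute-force enumeration is only practical for a handful of features (the SHAP library uses far more efficient algorithms for tree ensembles).

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the sample's value, the rest the baseline's.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(features):
    """Hypothetical maliciousness score; NOT the paper's trained model."""
    max_packet_size, mean_iat, tls_version = features
    score = 0.0
    if max_packet_size > 1000:
        score += 0.5
    if mean_iat < 0.1 and tls_version < 1.2:  # interaction of two features
        score += 0.25
    return score


sample = [1400, 0.05, 1.0]    # invented flow: large packets, fast, old TLS
baseline = [500, 1.0, 1.3]    # invented reference flow
phi = shapley_values(toy_score, sample, baseline)
```

The attributions sum to the difference between the sample's score and the baseline's score, which is the additivity property that lets per-feature contributions be read as a breakdown of a single prediction.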

Paper Structure (21 sections, 1 equation, 7 figures, 1 table, 1 algorithm)

Introduction
Related Work
Methodology
Data Preparation
Flow Construction and Feature Extraction
Detection Model
Random Forest
Extreme Gradient Boosting
Extremely Randomized Trees
Evaluation Metrics
Explainability
Experimental Settings
Experimental Result and Discussions
Detection Model Performance
Explaining Detection Model
...and 6 more sections

Figures (7)

Figure 1: Overview of the proposed explainable malware detection pipeline
Figure 2: Radar plot for performance comparison of the proposed malware detection across different datasets and metrics: (a) CTU-13; (b) Our dataset; (c) Our imbalanced dataset
Figure 3: Confusion matrix comparison on: (a) CTU-13; (b) Our imbalanced dataset
Figure 4: Summary plot of global explanation of XGBoost model trained using our dataset
Figure 5: Summary plot of global explanation of XGBoost model trained using CTU-13 dataset
...and 2 more figures
