Neurosymbolic Learning for Advanced Persistent Threat Detection under Extreme Class Imbalance

Quhura Fathima, Neda Moghim, Mostafa Taghizade Firouzjaee, Christo K. Thomas, Ross Gore, Walid Saad

TL;DR

This paper proposes a neurosymbolic architecture that integrates an optimized BERT model with logic tensor networks (LTN) for explainable APT detection in wireless IoT networks, and shows that it delivers high-performance, interpretable, and operationally viable detection for IoT network monitoring architectures.

Abstract

The growing deployment of Internet of Things (IoT) devices in smart cities and industrial environments increases vulnerability to stealthy, multi-stage advanced persistent threats (APTs) that exploit wireless communication. Detection is challenging for two reasons: severe class imbalance in network traffic limits the effectiveness of traditional deep learning approaches, and those approaches offer little explainability for their classification decisions. To address these challenges, this paper proposes a neurosymbolic architecture that integrates an optimized BERT model with logic tensor networks (LTN) for explainable APT detection in wireless IoT networks. The proposed method addresses the challenges of mobile IoT environments through efficient feature encoding that transforms network flow data into BERT-compatible sequences while preserving the temporal dependencies critical for APT stage identification. Severe class imbalance is mitigated using focal loss, hierarchical classification that separates normal traffic detection from attack categorization, and adaptive sampling strategies. Evaluation on the SCVIC-APT2021 dataset yields a binary classification F1 score of 95.27% with a false positive rate of 0.14%, and a macro F1 score of 76.75% for multi-class attack categorization. Furthermore, a novel explainability analysis statistically validates the importance of distinct network features. These results demonstrate that neurosymbolic learning enables high-performance, interpretable, and operationally viable APT detection for IoT network monitoring architectures.
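The abstract names focal loss as one of the measures against class imbalance. As a rough illustration of why it helps, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), next to plain cross-entropy. The function names and the defaults gamma = 2.0 and alpha = 0.25 are assumptions chosen for illustration (they are the commonly cited defaults for focal loss); the abstract does not state the values or the exact variant used in the paper.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(p_attack, label):
    """Plain binary cross-entropy for one sample (label 1 = attack, 0 = normal)."""
    p_t = p_attack if label == 1 else 1.0 - p_attack
    return -math.log(max(p_t, EPS))


def binary_focal_loss(p_attack, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample.

    gamma and alpha are illustrative defaults, not values taken from the paper.
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p_attack if label == 1 else 1.0 - p_attack
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, EPS), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy normal flow (true class predicted with probability 0.95) and a
# hard attack flow (true class predicted with probability 0.10).
easy_ratio = binary_focal_loss(0.05, 0) / cross_entropy(0.05, 0)
hard_ratio = binary_focal_loss(0.10, 1) / cross_entropy(0.10, 1)
print(f"easy sample keeps {easy_ratio:.4f} of its cross-entropy loss")
print(f"hard sample keeps {hard_ratio:.4f} of its cross-entropy loss")
```

Because the modulating factor (1 - p_t)^gamma is close to zero for confidently correct predictions, the many easy normal flows contribute little to the total loss, and the rare, poorly classified attack flows dominate the gradient. Setting gamma = 0 recovers alpha-weighted cross-entropy.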

Paper Structure (18 sections, 5 equations, 4 figures, 8 tables)

Figures (4)

  • Figure 1: System architecture showing parallel BERT and LTN pipelines, hierarchical classification heads, and multi-objective training integration.
  • Figure 2: Six-class confusion matrix at the optimal threshold of 0.98.
  • Figure 3: Feature importance distributions showing clear separation between attack and normal traffic patterns.
  • Figure 4: Statistical significance analysis ($-\log_{10}(p\text{-value})$) with effect size (ES) for each feature.