Enhancing Adversarial Robustness of IoT Intrusion Detection via SHAP-Based Attribution Fingerprinting

Dilli Prasad Sharma, Liang Xue, Xiaowei Sun, Xiaodong Lin, Pulei Xiong

TL;DR

This work addresses the adversarial vulnerability of IoT intrusion detection systems with a SHAP-based attribution fingerprinting approach. It extracts feature attribution vectors with SHAP's DeepExplainer, then trains a deep autoencoder on clean attribution fingerprints so that adversarial inputs can be detected without supervision. On the CIC-IoT2023 dataset, the SHAP-based detector achieves higher accuracy, precision, and robustness under FGSM, PGD, and DeepFool attacks, with significantly lower attack success rates than a state-of-the-art adversarially trained baseline. By combining explainable AI with adversarial detection, the method improves both the robustness and the transparency of IoT security defenses, with potential for efficient deployment in resource-constrained environments.

Abstract

The rapid proliferation of Internet of Things (IoT) devices has transformed numerous industries by enabling seamless connectivity and data-driven automation. However, this expansion has also exposed IoT networks to increasingly sophisticated security threats, including adversarial attacks on artificial intelligence (AI) and machine learning (ML)-based intrusion detection systems (IDS). Such attacks aim to evade detection, induce misclassification, and undermine the reliability and integrity of security defenses. To address these challenges, we propose a novel adversarial detection model that enhances the robustness of IoT IDS against adversarial attacks through SHapley Additive exPlanations (SHAP)-based fingerprinting. Using SHAP's DeepExplainer, we extract attribution fingerprints from network traffic features, enabling the IDS to reliably distinguish between clean and adversarially perturbed inputs. By capturing subtle attribution patterns, the model becomes more resilient to evasion attempts and adversarial manipulations. We evaluated the model on a standard IoT benchmark dataset, where it significantly outperformed a state-of-the-art method in detecting adversarial attacks. In addition to enhanced robustness, this approach improves model transparency and interpretability, thereby increasing trust in the IDS through explainable AI.
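
To make the notion of an attribution fingerprint concrete, here is a minimal sketch that is not the authors' pipeline. The paper applies SHAP's DeepExplainer to a deep IDS model. This sketch swaps in a toy linear scorer, for which SHAP values have a known closed form under feature independence: phi_i = w_i * (x_i - E[x_i]). The weights, background samples, inputs, and function names are all hypothetical.

```python
from statistics import fmean


def predict(weights, bias, x):
    """Toy linear scorer f(x) = w . x + b (a stand-in for the deep IDS model)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def linear_shap_fingerprint(weights, background, x):
    """Closed-form SHAP values for a linear model, assuming independent features.

    Each attribution is phi_i = w_i * (x_i - mean_i), where mean_i is the
    average of feature i over the background (reference) samples.
    """
    n_features = len(weights)
    means = [fmean(row[i] for row in background) for i in range(n_features)]
    return [w * (xi - m) for w, xi, m in zip(weights, x, means)]


# Hypothetical toy values, chosen only for illustration.
weights = [0.8, -0.5, 0.3]
bias = 0.1
background = [[0.2, 0.4, 0.1], [0.4, 0.2, 0.3], [0.3, 0.3, 0.2]]

clean = [0.3, 0.5, 0.2]
perturbed = [0.35, 0.1, 0.2]  # hand-made perturbation, not a real FGSM/PGD sample

fp_clean = linear_shap_fingerprint(weights, background, clean)
fp_perturbed = linear_shap_fingerprint(weights, background, perturbed)

print("clean fingerprint:    ", fp_clean)
print("perturbed fingerprint:", fp_perturbed)
```

In this toy case the perturbation flips the sign of the second attribution, which is the kind of shift in the attribution pattern that a detector trained only on clean fingerprints could pick up. The attributions also sum to the difference between the sample's score and the mean background score, which is the additivity property of SHAP values.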

Paper Structure

This paper contains 27 sections, 20 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Attribution Fingerprinting Generation.
  • Figure 2: Comparing Top-10 Features of Clean and Attack Samples Based on SHAP-Attribution.
  • Figure 3: Comparison of adversarial detection performance of our proposed SHAP-based model against the adversarially trained model with different attack scenarios: (a–b) FGSM, (c–d) PGD, and (e–f) DeepFool.
  • Figure 4: Comparing Reconstruction Error Distribution of Clean and Adversarial Test Samples with Threshold $\tau=0.02$.
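
The thresholding step behind Figure 4 can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it assumes the reconstruction error is the mean squared error between a fingerprint and its autoencoder reconstruction, and that a sample is flagged when the error exceeds $\tau$. The autoencoder itself is omitted, and the fingerprint and reconstruction values below are made up.

```python
def reconstruction_error(fingerprint, reconstruction):
    """Mean squared error between a fingerprint and its reconstruction (assumed metric)."""
    if len(fingerprint) != len(reconstruction):
        raise ValueError("fingerprint and reconstruction must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(fingerprint, reconstruction)]
    return sum(squared) / len(squared)


def is_adversarial(fingerprint, reconstruction, tau=0.02):
    """Flag a sample when its reconstruction error exceeds the threshold tau.

    tau = 0.02 is the threshold named in the Figure 4 caption; the strict
    'greater than' comparison is an assumption of this sketch.
    """
    return reconstruction_error(fingerprint, reconstruction) > tau


# Hypothetical values: an autoencoder trained on clean fingerprints is expected
# to reconstruct clean inputs closely and adversarial inputs poorly.
clean_fp, clean_rec = [0.10, -0.20, 0.05], [0.11, -0.18, 0.05]
adv_fp, adv_rec = [0.60, 0.30, -0.40], [0.20, 0.05, -0.10]

print(is_adversarial(clean_fp, clean_rec))  # error is far below tau
print(is_adversarial(adv_fp, adv_rec))      # error is well above tau
```

Because the autoencoder only ever sees clean fingerprints during training, no adversarial labels are needed; only the threshold has to be chosen, for example from the error distribution of held-out clean samples.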