SAFE: Self-Supervised Anomaly Detection Framework for Intrusion Detection

Elvin Li, Zhengli Shang, Onat Gungor, Tajana Rosing

TL;DR

IoT intrusion detection requires robust methods that can learn from unlabeled data and generalize to unseen attacks. SAFE tackles this by transforming tabular network data into image-like inputs (using PCA feature selection and a DeepInsight-based mapping), training a Masked Autoencoder to learn normal-behavior representations, and applying a Local Outlier Factor detector to the latent features for anomaly detection. The framework achieves state-of-the-art F1-scores across four diverse datasets, improving on SLAD by up to 26.2% and on Anomal-E by up to 23.5%, while maintaining low inference overhead. This work provides a practical, scalable SSL-based IDS pipeline for IoT/IIoT environments with strong generalization and real-time potential.

Abstract

The proliferation of IoT devices has significantly increased network vulnerabilities, creating an urgent need for effective Intrusion Detection Systems (IDS). Machine Learning-based IDS (ML-IDS) offer advanced detection capabilities but rely on labeled attack data, which limits their ability to identify unknown threats. Self-Supervised Learning (SSL) presents a promising solution by learning patterns from normal data only and flagging deviations as anomalies. This paper introduces SAFE, a novel framework that transforms tabular network intrusion data into an image-like format, enabling Masked Autoencoders (MAEs) to learn robust representations of network behavior. The features extracted by the MAEs are then passed to a lightweight novelty detector, enhancing the effectiveness of anomaly detection. Experimental results demonstrate that SAFE outperforms the state-of-the-art anomaly detection method, Scale Learning-based Deep Anomaly Detection (SLAD), by up to 26.2% and surpasses the state-of-the-art SSL-based network intrusion detection approach, Anomal-E, by up to 23.5% in F1-score.
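
The final stage described above, a lightweight novelty detector fit on latent features of normal traffic only, can be illustrated with a minimal Local Outlier Factor sketch. This is not the authors' implementation: the class `SimpleLOF`, the neighbor count `k=10`, and the synthetic Gaussian vectors standing in for MAE encoder outputs are all assumptions made for illustration.

```python
import numpy as np


def _pairwise_dist(a, b):
    """Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


class SimpleLOF:
    """Minimal Local Outlier Factor used as a novelty detector.

    Hypothetical helper: it is fit on normal samples only and scores
    unseen samples against that reference set.
    """

    def __init__(self, k=10):  # k is an assumed value, not from the paper
        self.k = k

    def fit(self, train):
        self.train = np.asarray(train, dtype=float)
        d = _pairwise_dist(self.train, self.train)
        np.fill_diagonal(d, np.inf)                # a point is not its own neighbor
        idx = np.argsort(d, axis=1)[:, :self.k]    # k nearest training neighbors
        nd = np.take_along_axis(d, idx, axis=1)
        self.k_dist = nd[:, -1]                    # distance to the k-th neighbor
        reach = np.maximum(nd, self.k_dist[idx])   # reachability distances
        self.lrd = 1.0 / (reach.mean(axis=1) + 1e-12)  # local reachability density
        return self

    def score(self, queries):
        """Return LOF scores; values well above 1 suggest an anomaly."""
        q = np.asarray(queries, dtype=float)
        d = _pairwise_dist(q, self.train)
        idx = np.argsort(d, axis=1)[:, :self.k]
        nd = np.take_along_axis(d, idx, axis=1)
        reach = np.maximum(nd, self.k_dist[idx])
        lrd_q = 1.0 / (reach.mean(axis=1) + 1e-12)
        return self.lrd[idx].mean(axis=1) / lrd_q


# Synthetic stand-ins for latent features (assumption: 8-dimensional vectors).
rng = np.random.default_rng(0)
normal_latent = rng.normal(0.0, 1.0, size=(200, 8))  # training set: normal only
test_normal = rng.normal(0.0, 1.0, size=(20, 8))     # unseen normal traffic
test_attack = rng.normal(6.0, 1.0, size=(20, 8))     # shifted, attack-like traffic

detector = SimpleLOF(k=10).fit(normal_latent)
scores_normal = detector.score(test_normal)
scores_attack = detector.score(test_attack)
print("mean LOF, normal:", scores_normal.mean())
print("mean LOF, attack:", scores_attack.mean())
```

Because the detector only ever sees normal samples during fitting, no attack labels are needed; a sample is flagged when its local density is much lower than that of its nearest normal neighbors. The decision threshold on the score is left open here, since this excerpt does not specify one.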

Paper Structure

This paper contains 17 sections, 3 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: State-of-the-art supervised ML intrusion detection performance on known and unknown attacks
  • Figure 2: The proposed framework (SAFE) overview. Our methodology begins with feature selection and mapping each vector data point to an image matrix. This mapped data is subsequently used to train an MAE. The encoder head of the MAE extracts latent features, which are then provided to the novelty detector, used to distinguish between attack and normal instances.
  • Figure 3: Visual representation of the masking operation. The MAE is trained to reconstruct the image after a random portion of its pixels is set to $0$.
  • Figure 4: Selected image matrices of normal data (top) and attack data (bottom) from X-IIoTID. Even upon superficial examination, the image matrices of normal data exhibit visual homogeneity. In contrast, the attack data manifests in various forms, resulting in transformations that intuitively appear distinct.
  • Figure 5: State-of-the-art Precision and Recall Comparison
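
The masking operation described in the Figure 3 caption can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `random_mask`, the 8x8 image size, and the 75% mask ratio are assumptions, since this excerpt does not state the values used.

```python
import numpy as np


def random_mask(image, mask_ratio=0.75, rng=None):
    """Set a random portion of pixels to 0.

    Returns the masked image and a boolean array marking the masked pixels.
    mask_ratio is an assumed value chosen for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    flat = image.reshape(-1).copy()               # copy so the input is untouched
    n_masked = int(round(mask_ratio * flat.size))
    masked_idx = rng.choice(flat.size, size=n_masked, replace=False)
    flat[masked_idx] = 0.0                        # masked pixels take value 0
    mask = np.zeros(flat.size, dtype=bool)
    mask[masked_idx] = True
    return flat.reshape(image.shape), mask.reshape(image.shape)


# Hypothetical 8x8 image matrix standing in for one mapped network record.
rng = np.random.default_rng(0)
image = rng.uniform(0.1, 1.0, size=(8, 8))
masked_image, mask = random_mask(image, mask_ratio=0.75, rng=rng)
print("masked pixels:", int(mask.sum()), "of", mask.size)
```

During training, the autoencoder would receive `masked_image` as input and be asked to reproduce `image`, so the boolean mask identifies which pixels the model had to infer rather than copy.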