Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with scarce Labeled Data

Mohammad Alikhani, Reza Kazemi

TL;DR

The paper tackles the challenge of intrusion detection under scarce labeled data in IoT/IIoT environments by proposing a semi-supervised contrastive learning framework built on the Kolmogorov-Arnold Network (KAN). It leverages abundant unlabeled data through a contrastive pretraining stage and then fine-tunes a final KAN layer using a small labeled subset, achieving real-time inference. Empirical results on UNSW-NB15, BoT-IoT, and Gas Pipeline show state-of-the-art performance with low labeling requirements, and the architecture offers interpretability via visualizable spline activations. This approach provides a robust, efficient, and transparent solution for security-critical deployment in industrial settings, with potential for rule extraction and future unsupervised extensions.

Abstract

In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyberattack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, the rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to effectively distinguish between normal and attack behaviors. We validate our approach on three benchmark datasets, UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20%, 1.28%, and 8% of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships, along with its learnable activation functions, is also explored and visualized, offering interpretability and the potential for rule extraction. The method supports multi-class classification and proves effective in safety-critical environments where reliability is paramount.
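
As a rough illustration of the contrastive pre-training idea described above, the sketch below computes an NT-Xent (SimCLR-style) loss over pairs of embeddings. The abstract does not state which contrastive objective the authors use, so NT-Xent is an assumption chosen because it is a common default; the function names, the temperature value, and the toy embeddings are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nt_xent(view_a, view_b, temperature=0.5):
    """NT-Xent loss over N positive pairs (view_a[i], view_b[i]).

    Assumption: this is a generic contrastive objective, not necessarily
    the one used in the paper.
    """
    z = list(view_a) + list(view_b)
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same record
        # Softmax denominator over every embedding except i itself.
        denom = sum(math.exp(cosine(z[i], z[k]) / temperature)
                    for k in range(2 * n) if k != i)
        numer = math.exp(cosine(z[i], z[pos]) / temperature)
        total += -math.log(numer / denom)
    return total / (2 * n)

# Toy embeddings: two records, each with two augmented views.
aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]])
swapped = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]])
print(aligned < swapped)
```

The loss is lower when the two views of a record lie close together and far from the other records, which is the property that lets unlabeled traffic shape the feature extractor before the small labeled subset is used for fine-tuning.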

Paper Structure

This paper contains 19 sections, 20 equations, 5 figures, 12 tables, and 2 algorithms.

Figures (5)

  • Figure 1: Visualization of a stack of $L$ KAN layers with input dimension $T$ and output dimension $C$, where $C$ denotes the number of classes \cite{baravsin2024exploring}.
  • Figure 2: The contrastive pre-training framework. The data-loading stage involves removing unnecessary columns, encoding categorical features, handling NaN values, and sampling a batch of data for contrastive learning.
  • Figure 3: t-SNE visualization of the feature extractor outputs before pre-training.
  • Figure 4: t-SNE visualization of the feature extractor outputs after pre-training.
  • Figure 5: Visualization of the spline-based learned activation functions, aggregated with SiLU, for the UNSW-NB15 dataset. The red points represent the control points referenced in \ref{eq:spline_basis_function}. The observed upward trend in the curves results from the addition of the SiLU function to the spline components, as described in \ref{eq:silu_spline}.
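
The activation described in the Figure 5 caption, a SiLU term added to a spline term, can be sketched as follows. The paper's own equations are not reproduced on this page, so the sketch assumes the standard KAN form $\phi(x) = w_b\,\mathrm{SiLU}(x) + w_s \sum_i c_i B_i(x)$ with B-spline bases $B_i$ evaluated by the Cox-de Boor recursion. The grid size, spline degree, weights, and all function names are hypothetical choices for illustration.

```python
import math

def silu(x):
    """SiLU (swish) base function: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def bspline_basis(i, k, knots, x):
    """Cox-de Boor recursion: i-th B-spline basis of degree k at x."""
    if k == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    if left_den != 0.0:
        left = (x - knots[i]) / left_den * bspline_basis(i, k - 1, knots, x)
    right = 0.0
    if right_den != 0.0:
        right = ((knots[i + k + 1] - x) / right_den
                 * bspline_basis(i + 1, k - 1, knots, x))
    return left + right

def kan_activation(x, coeffs, knots, degree=3, w_base=1.0, w_spline=1.0):
    """One learnable edge activation: weighted SiLU plus weighted spline."""
    spline = sum(c * bspline_basis(i, degree, knots, x)
                 for i, c in enumerate(coeffs))
    return w_base * silu(x) + w_spline * spline

# Assumed setup: 5 grid intervals on [-1, 1], cubic splines, and the grid
# extended by `degree` knots on each side, giving 8 basis functions.
GRID, DEGREE = 5, 3
STEP = 2.0 / GRID
knots = [-1.0 + (j - DEGREE) * STEP for j in range(GRID + 2 * DEGREE + 1)]
num_basis = len(knots) - DEGREE - 1

coeffs = [0.3, -0.2, 0.5, 0.1, -0.4, 0.2, 0.0, 0.6]  # hypothetical control points
curve = [kan_activation(x / 10.0, coeffs, knots) for x in range(-10, 10)]
print(len(curve), num_basis)
```

With all coefficients set to zero the function reduces to plain SiLU, which is one way to see why the plotted curves inherit SiLU's upward trend: the spline only adds a local, bounded correction on top of it.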