Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with Scarce Labeled Data
Mohammad Alikhani, Reza Kazemi
TL;DR
The paper tackles intrusion detection under scarce labeled data in IoT/IIoT environments by proposing a semi-supervised contrastive learning framework built on the Kolmogorov-Arnold Network (KAN). It leverages abundant unlabeled data through a contrastive pretraining stage, then fine-tunes a final KAN layer on a small labeled subset, and it supports real-time inference. Empirical results on UNSW-NB15, BoT-IoT, and Gas Pipeline show state-of-the-art performance using only 2.20%, 1.28%, and 8% of labeled samples, respectively, and the architecture offers interpretability through visualizable spline activations. The approach provides a robust, efficient, and transparent solution for security-critical deployment in industrial settings, with potential for rule extraction and future unsupervised extensions.
Abstract
In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyberattack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to distinguish between normal and attack behaviors. We validate our approach on three benchmark datasets, UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20%, 1.28%, and 8% of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships, along with its learnable activation functions, is also explored and visualized, offering interpretability and the potential for rule extraction. The method supports multi-class classification and proves effective in safety-critical environments where reliability is paramount.
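To make the contrastive pretraining idea concrete, the following is a minimal sketch of a contrastive objective over two views of the same unlabeled samples. The abstract does not specify the loss, so the NT-Xent (SimCLR-style) formulation used here is an assumption, as are the function name `nt_xent_loss`, the temperature of 0.5, and the use of small Gaussian noise as a stand-in augmentation. The vectors play the role of encoder outputs; no KAN encoder is implemented.

```python
import math
import random


def _normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss for N paired embeddings (assumed loss, not the paper's stated one).

    view_a[i] and view_b[i] are embeddings of two views of sample i.
    Each embedding's positive is its counterpart; the other 2N - 2
    embeddings in the batch act as negatives.
    """
    n = len(view_a)
    z = [_normalize(v) for v in list(view_a) + list(view_b)]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the paired view
        sims = [
            sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(2 * n)
            if j != i  # exclude self-similarity
        ]
        pos_sim = sum(a * b for a, b in zip(z[i], z[pos])) / temperature
        # Numerically stable log-sum-exp over all non-self similarities.
        m = max(sims)
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        total += log_denom - pos_sim
    return total / (2 * n)


# Hypothetical data: 8 "flow embeddings" of dimension 16.
rng = random.Random(0)
base = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]
# Second view: the same samples with small noise (stand-in augmentation).
aligned = [[x + rng.gauss(0, 0.05) for x in row] for row in base]
# Control: embeddings unrelated to the first view.
unrelated = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]

loss_aligned = nt_xent_loss(base, aligned)
loss_unrelated = nt_xent_loss(base, unrelated)
print(loss_aligned < loss_unrelated)
```

The loss is lower when the two views of each sample agree, which is the signal a pretraining stage of this kind would use to shape the representation without labels. In a two-stage setup like the one described, a classifier head would then be fitted on the small labeled subset.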
