Unknown Attack Detection in IoT Networks using Large Language Models: A Robust, Data-efficient Approach

Shan Ali; Feifei Niu; Paria Shirani; Lionel C. Briand

TL;DR

This paper tackles unknown attack detection in IoT networks by introducing SiamXBERT, a data-efficient and payload-free Siamese meta-learning framework that leverages dual-modality flow and packet features and a SecBERT backbone to learn a transferable similarity space. Through binary meta-tasks, triplet learning, and threshold calibration, SiamXBERT detects unseen attacks with few labeled examples and open-set inference, while robustly handling encrypted traffic. Empirical results on CICIoT2023 and IoT-23 show strong within-dataset performance and notable cross-dataset generalization, including substantial improvements in unknown attack F1-scores compared to SOTA baselines. The work demonstrates practical impact for real-world IoT security by reducing data requirements, avoiding payload inspection, and providing open-set capabilities that cope with distribution shifts and unseen threats.

Abstract

The rapid evolution of cyberattacks continues to drive the emergence of unknown (zero-day) threats, posing significant challenges for network intrusion detection systems in Internet of Things (IoT) networks. Existing machine learning and deep learning approaches typically rely on large labeled datasets, payload inspection, or closed-set classification, limiting their effectiveness under data scarcity, encrypted traffic, and distribution shifts. Consequently, detecting unknown attacks in realistic IoT deployments remains difficult. To address these limitations, we propose SiamXBERT, a robust and data-efficient Siamese meta-learning framework empowered by a transformer-based language model for unknown attack detection. The proposed approach constructs a dual-modality feature representation by integrating flow-level and packet-level information, enabling richer behavioral modeling while remaining compatible with encrypted traffic. Through meta-learning, the model rapidly adapts to new attack types using only a small number of labeled samples and generalizes to previously unseen behaviors. Extensive experiments on representative IoT intrusion datasets demonstrate that SiamXBERT consistently outperforms state-of-the-art baselines under both within-dataset and cross-dataset settings while requiring significantly less training data, achieving up to 78.8% improvement in unknown F1-score. These results highlight the practicality of SiamXBERT for robust unknown attack detection in real-world IoT environments.
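To make the similarity-space idea more concrete, the following is a minimal sketch of how triplet learning and a calibrated distance threshold can support open-set decisions. It is an illustration under stated assumptions, not the SiamXBERT implementation: the 2-D vectors stand in for learned embeddings, the class-mean prototypes and the quantile-based calibration are assumed simplifications, and all function names (`triplet_loss`, `class_prototypes`, `calibrate_threshold`, `predict_open_set`) are hypothetical.

```python
import math

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on Euclidean distances for one triplet."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

def class_prototypes(support):
    """Average the few labeled embeddings of each class (assumed simplification)."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos

def calibrate_threshold(support, protos, quantile=0.95):
    """Pick the threshold as a quantile of support-to-own-prototype distances."""
    dists = sorted(
        math.dist(v, protos[label])
        for label, vectors in support.items()
        for v in vectors
    )
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return dists[idx]

def predict_open_set(query, protos, threshold):
    """Return the nearest known class, or 'unknown' if it lies beyond the threshold."""
    label, dist = min(
        ((name, math.dist(query, proto)) for name, proto in protos.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= threshold else "unknown"

# Toy embeddings; in practice these would come from the trained encoder.
support = {
    "benign": [[0.0, 0.1], [0.1, 0.0], [0.0, -0.1]],
    "ddos": [[5.0, 5.1], [5.1, 5.0], [4.9, 5.0]],
}
protos = class_prototypes(support)
tau = calibrate_threshold(support, protos)
print(predict_open_set([0.05, 0.0], protos, tau))
print(predict_open_set([20.0, -20.0], protos, tau))
```

In this toy setup the first query falls close to the benign prototype and is assigned to it, while the second is far from every known class and is flagged as unknown. The threshold is what turns a closed-set nearest-class rule into an open-set one.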
