Multidomain transformer-based deep learning for early detection of network intrusion
Jinxin Liu, Murat Simsek, Michele Nogueira, Burak Kantarci
TL;DR
The paper tackles the latency inherent in traditional NIDS, which must accumulate the packets of a flow before classifying it, by detecting intrusions from partial network flows. It introduces TS-NFM, which casts per-flow packets as a multivariate time series, and MDT, which fuses time- and frequency-domain cues via a 2D FFT-enhanced Transformer with MD-MHA. Key contributions are the SCVIC-TS-2022 dataset, the TS-NFM feature extractor, and the MDT framework, which achieves a macro F1 of 84.1% on SCVIC-TS-2022 (31% higher than Transformer) and outperforms state-of-the-art early detection methods by 5% and 6% on the ECG and Wafer datasets, respectively. The approach shows that intrusion responses can come substantially earlier while remaining accurate, with practical impact for real-time network defense and for early time-series classification more broadly.
Abstract
Timely response of Network Intrusion Detection Systems (NIDS) is constrained by the flow generation process, which requires accumulation of network packets. This paper introduces Multivariate Time Series (MTS) early detection into NIDS to identify malicious flows prior to their arrival at target systems. With this in mind, we first propose a novel feature extractor, Time Series Network Flow Meter (TS-NFM), that represents a network flow as an MTS with explainable features, and we create a new benchmark dataset, called SCVIC-TS-2022, using TS-NFM and the meta-data of CICIDS2017. Additionally, a new deep learning-based early detection model called Multi-Domain Transformer (MDT) is proposed, which incorporates the frequency domain into the Transformer. This work further proposes a Multi-Domain Multi-Head Attention (MD-MHA) mechanism to improve the ability of MDT to extract better features. Based on the experimental results, the proposed methodology improves the earliness of the conventional NIDS (i.e., percentage of packets that are used for classification) by a factor of 5 × 10^4 and duration-based earliness (i.e., percentage of duration of the classified packets of a flow) by a factor of 60, while achieving an 84.1% macro F1 score (31% higher than Transformer) on SCVIC-TS-2022. Additionally, the proposed MDT outperforms the state-of-the-art early detection methods by 5% and 6% on ECG and Wafer datasets, respectively.
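
The two earliness measures defined in the abstract can be illustrated with a minimal Python sketch. The function names, the example timestamps, and the choice to measure the observed duration from the first packet of the flow to the last packet used are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from typing import Sequence


def packet_earliness(packets_used: int, total_packets: int) -> float:
    """Percentage of a flow's packets consumed before classification.

    Lower is earlier. Hypothetical helper following the abstract's wording.
    """
    if total_packets <= 0 or not 0 < packets_used <= total_packets:
        raise ValueError("need 0 < packets_used <= total_packets")
    return 100.0 * packets_used / total_packets


def duration_earliness(timestamps: Sequence[float], packets_used: int) -> float:
    """Percentage of the flow's duration covered by the packets used.

    Assumption: duration runs from the first packet's timestamp to the
    timestamp of the last packet that was used for classification.
    """
    if not timestamps or not 0 < packets_used <= len(timestamps):
        raise ValueError("need 0 < packets_used <= len(timestamps)")
    total = timestamps[-1] - timestamps[0]
    if total == 0:
        # Degenerate flow with no measurable duration.
        return 100.0
    observed = timestamps[packets_used - 1] - timestamps[0]
    return 100.0 * observed / total


# Made-up arrival times (seconds) for an 8-packet flow.
example_timestamps = [0.0, 0.25, 0.5, 1.0, 2.0, 4.0, 6.0, 8.0]
pkt_pct = packet_earliness(2, len(example_timestamps))   # 25.0
dur_pct = duration_earliness(example_timestamps, 2)      # 3.125
```

In this made-up flow, classifying after two of eight packets uses 25% of the packets but only 3.125% of the flow's duration, which shows why the two measures can differ considerably for flows whose packets are unevenly spaced in time.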
