Hierarchical Local-Global Feature Learning for Few-shot Malicious Traffic Detection
Songtao Peng, Lei Wang, Wu Shuai, Hao Song, Jiajun Zhou, Shanqing Yu, Qi Xuan
TL;DR
This work tackles malicious traffic detection under few-shot conditions by introducing HLoG, a hierarchical framework that learns both local phase-level and global session-level representations. A sliding window segments each session into phases, multi-layer Bi-GRUs encode the phases into local features, and a separate Bi-GRU models those features globally. A session similarity network fuses cosine-based local similarity with self-attention-enhanced global features to produce similarity scores, which drive a similarity-based few-shot classifier trained with a mean squared error objective. The authors reconstruct three few-shot datasets (CIC-IDS-FS, TON-IoT-FS, IDS-FS) and report that HLoG achieves state-of-the-art performance, with high recall and substantially fewer false positives across binary and multi-class tasks. The approach is practically relevant to real-world cybersecurity, where labeled data are scarce and rapid adaptation to new attack types is essential, and it lays a foundation for applying hierarchical local-global feature learning to related anomaly-detection problems.
Abstract
With the rapid growth of internet traffic, malicious network attacks have become increasingly frequent and sophisticated, posing significant threats to global cybersecurity. Traditional detection methods, including rule-based and machine learning-based approaches, struggle to accurately identify emerging threats, particularly in scenarios with limited samples. While recent advances in few-shot learning have partially addressed the data scarcity issue, existing methods still exhibit high false positive rates and lack the capability to effectively capture crucial local traffic patterns. In this paper, we propose HLoG, a novel hierarchical few-shot malicious traffic detection framework that leverages both local and global features extracted from network sessions. HLoG employs a sliding-window approach to segment sessions into phases, capturing fine-grained local interaction patterns through hierarchical bidirectional GRU encoding, while simultaneously modeling global contextual dependencies. We further design a session similarity assessment module that integrates local similarity with global self-attention-enhanced representations, achieving accurate and robust few-shot traffic classification. Comprehensive experiments on three meticulously reconstructed datasets demonstrate that HLoG significantly outperforms existing state-of-the-art methods. Particularly, HLoG achieves superior recall rates while substantially reducing false positives, highlighting its effectiveness and practical value in real-world cybersecurity applications.
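The sliding-window segmentation and local similarity steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. The window size, stride, per-packet feature vectors, and all function names are hypothetical. Mean pooling stands in for the hierarchical Bi-GRU phase encoder, and averaging all pairwise cosine similarities is only one assumed way to aggregate phase-level similarity.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]  # hypothetical per-packet feature vector


def segment_into_phases(session: Sequence, window: int, stride: int) -> List[list]:
    """Split a session into overlapping phases with a sliding window.

    Trailing packets that do not fill a complete window are dropped in this sketch.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(session) <= window:
        return [list(session)]
    return [list(session[i:i + window])
            for i in range(0, len(session) - window + 1, stride)]


def mean_pool(phase: Sequence[Vector]) -> List[float]:
    """Stand-in phase encoder (assumption): average packet features per dimension."""
    n = len(phase)
    return [sum(column) / n for column in zip(*phase)]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def local_similarity(session_a: Sequence[Vector], session_b: Sequence[Vector],
                     window: int = 4, stride: int = 2) -> float:
    """Average cosine similarity over all phase pairs (assumed aggregation)."""
    phases_a = [mean_pool(p) for p in segment_into_phases(session_a, window, stride)]
    phases_b = [mean_pool(p) for p in segment_into_phases(session_b, window, stride)]
    scores = [cosine(pa, pb) for pa in phases_a for pb in phases_b]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Two toy sessions of six packets, each packet described by two features.
    query = [[1.0, 0.0]] * 6
    support = [[0.0, 1.0]] * 6
    print(local_similarity(query, query))
    print(local_similarity(query, support))
```

In a few-shot setting, a query session would be scored against the labeled support sessions in this way, with the learned global representation contributing a second signal that this sketch omits.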
