A Flow is a Stream of Packets: A Stream-Structured Data Approach for DDoS Detection
Raja Giryes, Lior Shafir, Avishai Wool
TL;DR
This work tackles the challenge of timely and accurate DDoS detection by rethinking flow representation. It treats flows as variable-length streams of packet headers and classifies them with a Set-Tree model that supports permutation-invariant, set-based splits and an attention mechanism, enabling effective early detection from a handful of packets. The approach achieves near-perfect accuracy on CICDDoS2019 and strong performance on CICIDS2017, with substantial time savings when using only the first 2–4 packets, while inspecting only 4–6% of the traffic volume. The method offers practical benefits in speed, interpretability, and payload-free detection, making it suitable for real-time network defense.
Abstract
Distributed Denial of Service (DDoS) attacks are becoming increasingly harmful to the Internet and show no signs of slowing down. Developing an accurate detection mechanism to thwart DDoS attacks remains a major challenge due to the rich variety of these attacks and the emergence of new attack vectors. In this paper, we propose a new tree-based DDoS detection approach that operates on a flow as a stream structure, rather than the traditional fixed-size record structure containing aggregated flow statistics. Although aggregated flow records have gained popularity over the past decade, providing an effective means for flow-based intrusion detection by inspecting only a fraction of the total traffic volume, they are inherently constrained. Their detection precision is limited not only by the lack of packet payloads, but also by their structure, which cannot model fine-grained inter-packet relations such as packet order and temporal relations. Additionally, aggregated flow statistics can be computed only after the complete flow has ended. Here we show that treating flow inputs as variable-length streams composed of their associated packet headers allows for very accurate and fast detection of malicious flows. We evaluate our proposed strategy on the CICDDoS2019 and CICIDS2017 datasets, which contain a comprehensive variety of DDoS attacks. Our approach matches or exceeds the accuracy of existing machine learning techniques, including state-of-the-art deep learning methods. Furthermore, our method achieves significantly earlier detection: on CICDDoS2019, for example, detection is based on the first 2 packets, which corresponds to an average time saving of 99.79% and uses only 4–6% of the traffic volume.
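
To make the contrast between the two flow representations concrete, the following is a minimal sketch, not the authors' implementation. The `PacketHeader` fields, the helper names (`aggregate_record`, `stream_prefix`, `set_split`), and the max-over-length split rule are all assumptions chosen for illustration; the paper's actual features and Set-Tree split criteria may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PacketHeader:
    # Hypothetical header fields; the real feature set is an assumption here.
    timestamp: float  # arrival time in seconds
    length: int       # packet length in bytes
    tcp_flags: int    # raw TCP flag bits (0 for non-TCP packets)


def aggregate_record(flow: List[PacketHeader]) -> Dict[str, float]:
    """Traditional fixed-size record: needs the complete flow before it exists."""
    total_bytes = sum(p.length for p in flow)
    return {
        "packets": len(flow),
        "bytes": total_bytes,
        "duration": flow[-1].timestamp - flow[0].timestamp,
        "mean_length": total_bytes / len(flow),
    }


def stream_prefix(flow: List[PacketHeader], k: int) -> List[PacketHeader]:
    """Stream view: the first k headers are usable as soon as they arrive."""
    return flow[:k]


def set_split(packets: List[PacketHeader], threshold: int) -> bool:
    """Illustrative permutation-invariant split (an assumed rule, not the
    paper's): aggregate a field over the set, then compare to a threshold."""
    return max(p.length for p in packets) >= threshold


# A toy four-packet flow; the values are made up for the example.
flow = [
    PacketHeader(0.00, 60, 0x02),
    PacketHeader(0.01, 60, 0x12),
    PacketHeader(0.02, 52, 0x10),
    PacketHeader(1.50, 1500, 0x18),
]

record = aggregate_record(flow)   # only available once the flow has ended
prefix = stream_prefix(flow, 2)   # available after the second packet
early_decision = set_split(prefix, threshold=100)
```

In this sketch the aggregated record collapses the flow into four numbers and loses per-packet detail, while the stream prefix keeps each header, including its timestamp, so a classifier can act on the first few packets without waiting for the flow to finish.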
