Distributed Incast Detection in Data Center Networks
Yiming Zheng, Haoran Qi, Lirui Yu, Zhan Shu, Qing Zhao
TL;DR
The paper addresses incast in data center networks and the limitations of queue-threshold detectors. It introduces DIDIE, a distributed switch-level incast detector that uses a sequential hypothesis test to classify a flow as incast or regular from its first arriving packet, enabling fast per-flow decisions. The method models regular and incast traffic with separate inter-arrival distributions and derives an optimal inter-arrival threshold $\epsilon$ by minimizing a linear cost over the ROC curve. Key parameters are learned online with an exponentially weighted moving average (EWMA). ns-3 experiments show significant improvements in detection speed and accuracy over queue-length baselines, including a 0% false-positive rate on real-world traffic, highlighting the method’s practical potential for incast-aware pacing and congestion control.
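To make the threshold idea concrete, the following is a minimal sketch, not the paper's derivation. It assumes that new-flow inter-arrival times are exponentially distributed under both hypotheses, that a flow is declared incast when its gap is below $\epsilon$, and that the cost is a weighted sum of false-positive and miss probabilities. The function names, the exponential model, the prior, and the cost weights are all illustrative assumptions; the summary above does not state which distributions or costs the authors use.

```python
import math


def expected_cost(eps, rate_regular, rate_incast,
                  prior_incast=0.5, cost_fp=1.0, cost_fn=1.0):
    """Linear cost of the rule 'declare incast if gap < eps'.

    Assumes exponential inter-arrival times (an illustrative choice):
    false-positive rate = P(gap < eps | regular),
    miss rate           = P(gap >= eps | incast).
    """
    false_positive = 1.0 - math.exp(-rate_regular * eps)
    miss = math.exp(-rate_incast * eps)
    return (cost_fp * (1.0 - prior_incast) * false_positive
            + cost_fn * prior_incast * miss)


def optimal_threshold(rate_regular, rate_incast,
                      prior_incast=0.5, cost_fp=1.0, cost_fn=1.0):
    """Threshold eps that minimizes expected_cost under the same assumptions.

    Setting the derivative of the cost to zero gives
    eps = ln(ratio) / (rate_incast - rate_regular), where ratio compares the
    weighted incast and regular densities at a gap of zero.
    """
    if rate_incast <= rate_regular:
        raise ValueError("incast flows are assumed to arrive faster than regular flows")
    ratio = (cost_fn * prior_incast * rate_incast) / (
        cost_fp * (1.0 - prior_incast) * rate_regular)
    if ratio <= 1.0:
        # Declaring incast is never worth its cost; the rule never fires.
        return 0.0
    return math.log(ratio) / (rate_incast - rate_regular)


if __name__ == "__main__":
    # Hypothetical rates: 1,000 new flows/s normally, 100,000 new flows/s during incast.
    eps = optimal_threshold(1_000.0, 100_000.0)
    print(f"threshold: {eps * 1e6:.1f} microseconds")
```

Under these assumptions the threshold grows when misses are made more expensive and shrinks when false positives are, which is the trade-off a linear cost on the ROC curve expresses.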
Abstract
Incast traffic in data centers can lead to severe performance degradation, such as packet loss and increased latency. Effectively addressing incast requires prompt and accurate detection. Existing solutions, including MA-ECN, BurstRadar and Pulser, typically rely on fixed thresholds on switch port egress queue lengths or their gradients to identify microbursts caused by incast flows. However, these queue-length-based methods often suffer from delayed detection and high error rates. In this study, we propose a distributed, switch-level incast detection method for data center networks that leverages a probabilistic hypothesis test with an optimal detection threshold. By analyzing the arrival intervals of new flows, our algorithm can determine from a flow's initial packet whether that flow is part of incast traffic. The experimental results demonstrate that our method offers significant improvements over existing approaches in both detection speed and inference accuracy.
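The per-flow decision described in the abstract can be illustrated with a short sketch. It is a simplified stand-in rather than the authors' algorithm: it assumes a single switch port, a fixed threshold `eps` supplied by the caller, and an EWMA that tracks the mean gap between flows classified as regular. The class name `IncastDetector`, the method `on_new_flow`, and the smoothing factor `alpha` are hypothetical, and the abstract does not say which parameters the real detector learns.

```python
class IncastDetector:
    """Classifies each new flow from the arrival time of its first packet.

    Assumption: a new flow is labeled incast when the gap since the previous
    new flow on the same port is shorter than eps (in seconds).
    """

    def __init__(self, eps, alpha=0.125):
        self.eps = eps                  # inter-arrival threshold, seconds
        self.alpha = alpha              # EWMA smoothing factor (illustrative value)
        self.last_arrival = None        # arrival time of the previous new flow
        self.mean_regular_gap = None    # EWMA of gaps labeled regular

    def on_new_flow(self, now):
        """Return True if the flow whose first packet arrives at `now` looks like incast."""
        if self.last_arrival is None:
            # No earlier flow to compare with, so treat the first one as regular.
            self.last_arrival = now
            return False

        gap = now - self.last_arrival
        self.last_arrival = now
        is_incast = gap < self.eps

        if not is_incast:
            # Update the running estimate only with gaps labeled regular.
            if self.mean_regular_gap is None:
                self.mean_regular_gap = gap
            else:
                self.mean_regular_gap = (
                    (1.0 - self.alpha) * self.mean_regular_gap + self.alpha * gap
                )
        return is_incast


if __name__ == "__main__":
    detector = IncastDetector(eps=1e-4)
    # Hypothetical first-packet arrival times in seconds: two flows arrive in a tight burst.
    arrivals = [0.0, 0.010, 0.020, 0.02001, 0.02002, 0.040]
    print([detector.on_new_flow(t) for t in arrivals])
```

Because the decision uses only the arrival time of a flow's first packet, it does not wait for the egress queue to build up, which is the source of the delay in queue-length-based detectors.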
