2FA Sketch: Two-Factor Armor Sketch for Accurate and Efficient Heavy Hitter Detection in Data Streams
Xilai Liu, Xinyi Zhang, Bingqing Liu, Tao Li, Tong Yang, Gaogang Xie
TL;DR
The paper tackles heavy hitter detection in high-rate data streams under tight memory by proposing the Two-Factor Armor (2FA) Sketch, a sketch with two layers of protection. It combines an improved in-bucket Arbitration strategy with a cross-bucket conflict Avoidance hashing scheme, supported by a theoretically derived optimal $\lambda$ and a redesigned $vote^+_{new}$ that serves as a conflict indicator. Experiments on CAIDA traces and Zipf-distributed synthetic data show error rates $2.5\times$ to $19.7\times$ lower than Elastic Sketch and about $1.03\times$ its throughput, with robust performance under varied memory constraints. The solution advances real-time heavy hitter detection by mitigating recall loss from congestion, and the source code is publicly available for reproducibility.
Abstract
Detecting heavy hitters, which are flows exceeding a specified threshold, is crucial for network measurement, but it is becoming harder as throughput grows and memory remains constrained. Existing sketch-based solutions, particularly those using Comparative Counter Voting, have limitations in efficiently identifying heavy hitters. This paper introduces the Two-Factor Armor (2FA) Sketch, a novel data structure designed to enhance heavy hitter detection in data streams. 2FA Sketch implements dual-layer protection through an improved $\mathtt{Arbitration}$ strategy for in-bucket competition and a cross-bucket conflict $\mathtt{Avoidance}$ hashing scheme. By theoretically deriving an optimal $\lambda$ parameter and redesigning $vote^+_{new}$ as a conflict indicator, it optimizes the Comparative Counter Voting strategy. Experimental results show that 2FA Sketch outperforms the standard Elastic Sketch, reducing error rates by a factor of 2.5 to 19.7 while reaching 1.03 times its processing speed.
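For readers unfamiliar with Comparative Counter Voting, the following is a minimal sketch of the baseline, Elastic-Sketch-style voting rule that the abstract says 2FA Sketch optimizes. It does not implement 2FA's Arbitration or Avoidance mechanisms, which are not specified in this summary. The names `Bucket`, `VotingSketch`, `vote_pos`, and `vote_neg` are hypothetical, the value of `lam` is illustrative rather than the optimal $\lambda$ derived in the paper, and the light part is replaced by an exact dictionary purely to keep the example short.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Bucket:
    key: Optional[str] = None  # flow currently held by the bucket
    vote_pos: int = 0          # votes for the resident flow
    vote_neg: int = 0          # votes from competing flows


class VotingSketch:
    """Baseline Comparative Counter Voting (assumed Elastic-Sketch-style)."""

    def __init__(self, num_buckets: int, lam: float = 8.0) -> None:
        self.buckets: List[Bucket] = [Bucket() for _ in range(num_buckets)]
        self.lam = lam  # eviction threshold on vote_neg / vote_pos (illustrative)
        # Assumption: an exact dict stands in for the compact light part.
        self.light: Dict[str, int] = {}

    def _bucket(self, key: str) -> Bucket:
        # crc32 gives a hash that is stable across runs.
        return self.buckets[zlib.crc32(key.encode("utf-8")) % len(self.buckets)]

    def insert(self, key: str) -> None:
        b = self._bucket(key)
        if b.key is None:              # empty bucket: claim it
            b.key, b.vote_pos = key, 1
            return
        if b.key == key:               # resident flow: positive vote
            b.vote_pos += 1
            return
        b.vote_neg += 1                # competing flow: negative vote
        if b.vote_neg / b.vote_pos >= self.lam:
            # Evict the resident flow to the light part; the newcomer takes over.
            self.light[b.key] = self.light.get(b.key, 0) + b.vote_pos
            b.key, b.vote_pos, b.vote_neg = key, 1, 1
        else:
            self.light[key] = self.light.get(key, 0) + 1

    def query(self, key: str) -> int:
        b = self._bucket(key)
        in_bucket = b.vote_pos if b.key == key else 0
        return in_bucket + self.light.get(key, 0)

    def heavy_hitters(self, threshold: int) -> Dict[str, int]:
        # Only flows resident in a bucket are reported as candidates.
        return {
            b.key: self.query(b.key)
            for b in self.buckets
            if b.key is not None and self.query(b.key) >= threshold
        }
```

In this simplified rule, a larger `lam` makes the resident flow harder to evict, so the choice of threshold trades stability of true heavy hitters against the chance that an early small flow blocks a later large one. That trade-off is the motivation the abstract gives for deriving an optimal $\lambda$.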
