Fair-Count-Min: Frequency Estimation under Equal Group-wise Approximation Factor
Nima Shahbazi, Stavros Sintos, Abolfazl Asudeh
TL;DR
This paper addresses fairness in streaming frequency estimation. Count-Min (CM) sketches have an additive error that disproportionately harms low-frequency elements. The paper introduces Fair-Count-Min (FCM), a group-fair sketch that guarantees equal expected multiplicative approximation factors across predefined element groups by combining group-aware semi-uniform hashing with a column-partitioning scheme. The authors prove the fairness guarantee, analyze the price of fairness (PoF), and develop exact and practical algorithms for computing the optimal bucket allocation per group. They show that fairness adds negligible additive error (the added error is often negative for d=1) while retaining CM’s space and time efficiency. Experiments on real and synthetic datasets confirm that FCM achieves group fairness across diverse settings with minimal overhead, making it a practical, theoretically grounded approach to fair frequency estimation in streaming contexts.
Abstract
Frequency estimation in streaming data often relies on sketches like Count-Min (CM) to provide approximate answers with sublinear space. However, CM sketches introduce additive errors that disproportionately impact low-frequency elements, creating fairness concerns across different groups of elements. We introduce Fair-Count-Min, a frequency estimation sketch that guarantees equal expected approximation factors across element groups, thus addressing the unfairness issue. We propose a column partitioning approach with group-aware semi-uniform hashing to eliminate collisions between elements from different groups. We provide theoretical guarantees for fairness, analyze the price of fairness, and validate our theoretical findings through extensive experiments on real-world and synthetic datasets. Our experimental results show that Fair-Count-Min achieves fairness with minimal additional error and maintains competitive efficiency compared to standard CM sketches.
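The column-partitioning idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `GroupPartitionedCountMin`, the use of BLAKE2b as the hash, and the `buckets_per_group` input are assumptions made for illustration. In particular, the per-group bucket counts are taken as given here, whereas the paper computes them with its own allocation algorithms. Each group owns a disjoint block of columns in every row, and an element is hashed only within its group's block, so elements from different groups never collide.

```python
import hashlib
from typing import Dict, Hashable, List, Tuple


class GroupPartitionedCountMin:
    """Illustrative Count-Min variant with one disjoint column block per group.

    Assumption: bucket counts per group are supplied by the caller; how to
    choose them fairly is outside the scope of this sketch.
    """

    def __init__(self, depth: int, buckets_per_group: Dict[str, int]) -> None:
        if depth <= 0:
            raise ValueError("depth must be positive")
        self.depth = depth
        self.blocks: Dict[str, Tuple[int, int]] = {}  # group -> (start, width)
        offset = 0
        for group, width in buckets_per_group.items():
            if width <= 0:
                raise ValueError("each group needs at least one bucket")
            self.blocks[group] = (offset, width)
            offset += width
        self.width = offset
        self.table: List[List[int]] = [[0] * self.width for _ in range(depth)]

    def _column(self, row: int, item: Hashable, group: str) -> int:
        # Hash the item with a per-row salt, then map it into the
        # group's own block of columns only.
        start, width = self.blocks[group]
        digest = hashlib.blake2b(
            f"{row}:{item!r}".encode("utf-8"), digest_size=8
        ).digest()
        return start + int.from_bytes(digest, "big") % width

    def update(self, item: Hashable, group: str, count: int = 1) -> None:
        # Add the count to one counter per row, as in a standard CM sketch.
        for row in range(self.depth):
            self.table[row][self._column(row, item, group)] += count

    def estimate(self, item: Hashable, group: str) -> int:
        # Minimum over rows; with non-negative updates this never
        # underestimates the true frequency.
        return min(
            self.table[row][self._column(row, item, group)]
            for row in range(self.depth)
        )
```

With `buckets_per_group={"A": 4, "B": 2}`, for example, group A uses columns 0 to 3 and group B uses columns 4 and 5 in every row. The overestimate for any element of B then depends only on the other elements of B, regardless of how heavy the traffic in A is.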
