Counter Pools: Counter Representation for Efficient Stream Processing
Ran Ben Basat, Gil Einziger, Bilal Tayh, Shay Vargaftik
TL;DR
Counter Pools address the memory bottleneck of counter arrays in stream processing by introducing fixed-size pools, each holding several variable-sized counters. The approach uses a stars-and-bars encoding to map the counter sizes within a pool to a compact configuration number. This allows counters to be resized dynamically and, when a pool can no longer fit its counters, lets the pool failure be handled gracefully. The paper reports improved space-accuracy tradeoffs for sketches and faster exact histogram counting through reduced load factors, with an evaluation against state-of-the-art methods. The technique is especially relevant to heavy-tailed network workloads, where most counters stay small and only a few grow large, so sizing each counter to its needs saves memory and improves accuracy.
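To make the stars-and-bars idea concrete, the following is a minimal sketch of how a tuple of counter sizes can be mapped to a single configuration number and back. It assumes that all bits of a pool are split among its counters and that sizes are ranked in lexicographic order; the function names (num_configs, encode, decode) and the ordering are illustrative assumptions, not the paper's implementation.

```python
from math import comb


def num_configs(n_bits, k):
    """Stars and bars: ways to split n_bits among k counters (size 0 allowed)."""
    return comb(n_bits + k - 1, k - 1)


def encode(sizes, n_bits):
    """Map counter sizes (summing to n_bits) to a rank in [0, num_configs)."""
    assert sum(sizes) == n_bits
    k = len(sizes)
    rank, remaining = 0, n_bits
    for i, size in enumerate(sizes[:-1]):
        parts_left = k - i - 1  # counters that come after position i
        # Skip every configuration whose i-th size is smaller than `size`.
        for smaller in range(size):
            rank += comb(remaining - smaller + parts_left - 1, parts_left - 1)
        remaining -= size
    return rank


def decode(rank, n_bits, k):
    """Inverse of encode: recover the counter sizes from a configuration number."""
    sizes, remaining = [], n_bits
    for i in range(k - 1):
        parts_left = k - i - 1
        size = 0
        while True:
            block = comb(remaining - size + parts_left - 1, parts_left - 1)
            if rank < block:
                break
            rank -= block
            size += 1
        sizes.append(size)
        remaining -= size
    sizes.append(remaining)  # the last counter takes whatever is left
    return sizes


# Illustrative parameters (assumed): 64 bits split among 4 counters.
configs = num_configs(64, 4)                 # 47905 possible size tuples
config_bits = (configs - 1).bit_length()     # 16 bits suffice to name one
example = [40, 16, 6, 2]
assert decode(encode(example, 64), 64, 4) == example
```

Under these assumptions the configuration number is far smaller than storing each counter's size explicitly, which is what makes per-pool resizing cheap to represent.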
Abstract
Due to the large data volume and number of distinct elements, space is often the bottleneck of many stream processing systems. The data structures used by these systems often consist of counters whose optimization yields significant memory savings. The challenge lies in balancing the size of the counters: too small, and they overflow; too large, and memory capacity limits their number. In this work, we suggest an efficient encoding scheme that sizes each counter according to its needs. Our approach uses fixed-sized pools of memory (e.g., a single memory word or 64 bits), where each pool manages a small number of counters. We pay special attention to performance and demonstrate considerable improvements for various streaming algorithms and workload characteristics.
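The tradeoff described above can be illustrated with a toy model of a single pool. This is a simplified sketch under stated assumptions: the class name Pool and its methods are hypothetical, each counter is charged exactly the bits its current value needs, and the bits required to store the configuration itself are ignored.

```python
class Pool:
    """Toy model: k counters sharing a fixed budget of capacity_bits."""

    def __init__(self, k=4, capacity_bits=64):
        self.values = [0] * k
        self.capacity_bits = capacity_bits

    def bits_used(self):
        # Each counter is charged only the bits its current value needs.
        return sum(v.bit_length() for v in self.values)

    def increment(self, i, amount=1):
        """Add to counter i; return False if the pool cannot fit the result."""
        new_value = self.values[i] + amount
        needed = (self.bits_used()
                  - self.values[i].bit_length()
                  + new_value.bit_length())
        if needed > self.capacity_bits:
            return False  # pool failure: the caller must handle it
        self.values[i] = new_value
        return True


pool = Pool()
ok_large = pool.increment(0, 2**40)   # needs 41 bits; a fixed 16-bit slot would overflow
ok_medium = pool.increment(1, 2**22)  # needs 23 bits; the pool is now exactly full
ok_small = pool.increment(2)          # needs 1 more bit than the pool has
```

In this run the first two increments succeed because the large counter borrows bits that the small counters do not need, and the third fails because the 64-bit budget is exhausted.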
