Continual Counting with Gradual Privacy Expiration
Joel Daniel Andersson, Monika Henzinger, Rasmus Pagh, Teresa Anna Steiner, Jalaj Upadhyay
TL;DR
This paper studies continual counting under differential privacy with gradual privacy expiration, introducing a flexible expiration function $g$ that models decreasing data sensitivity over time. It develops a pan-private, dyadic-interval based algorithm with four key adaptations to support unbounded streams, achieving an additive error of $O(\log(T)/\varepsilon)$ for a broad class of $g$, and proves a matching lower bound showing near-tightness. The mechanism runs in amortized $O(1)$ time per update and uses $O(\log T)$ space, with empirical results indicating favorable privacy-utility trade-offs compared to a natural baseline, especially for large delays. The work thus provides tight, scalable, and practical guarantees for continual counting under expiration, with potential impact on private streaming analytics and privacy-preserving federated learning.
Abstract
Differential privacy with gradual expiration models the setting where data items arrive in a stream and, at a given time $t$, the privacy loss guaranteed for a data item seen at time $(t-d)$ is $\varepsilon g(d)$, where $g$ is a monotonically non-decreasing function. We study the fundamental $\textit{continual (binary) counting}$ problem, where each data item consists of a bit and the algorithm needs to output, at each time step, the sum of all the bits streamed so far. For a stream of length $T$ and privacy $\textit{without}$ expiration, continual counting is possible with maximum (over all time steps) additive error $O(\log^2(T)/\varepsilon)$, and the best known lower bound is $\Omega(\log(T)/\varepsilon)$; closing this gap is a challenging open problem. We show that the situation is very different for privacy with gradual expiration by giving upper and lower bounds for a large set of expiration functions $g$. Specifically, our algorithm achieves an additive error of $O(\log(T)/\varepsilon)$ for a large set of privacy expiration functions. We also give a lower bound that shows that if $C$ is the additive error of any $\varepsilon$-DP algorithm for this problem, then the product of $C$ and the privacy expiration function after $2C$ steps must be $\Omega(\log(T)/\varepsilon)$. Our algorithm matches this lower bound, as its additive error is $O(\log(T)/\varepsilon)$ even when $g(2C) = O(1)$. Our empirical evaluation shows that we achieve a slowly growing privacy loss, with significantly smaller empirical privacy loss for large values of $d$ than a natural baseline algorithm.
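
To make the continual counting problem concrete, the following is a minimal Python sketch of the classical dyadic-interval (binary tree) mechanism for the setting without expiration. It is not the algorithm of this paper: it assumes a known stream length, Laplace noise with scale (number of dyadic levels)/$\varepsilon$, and no privacy expiration. The names `binary_mechanism` and `laplace` are hypothetical and chosen only for illustration.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def binary_mechanism(bits, epsilon, seed=0):
    """Noisy prefix sums of a bit stream via dyadic intervals.

    Illustrative baseline only (assumed setup, no privacy expiration):
    every bit lies in at most `levels` dyadic intervals, so each
    interval sum receives Laplace noise of scale levels / epsilon.
    """
    rng = random.Random(seed)
    T = len(bits)
    levels = T.bit_length()          # floor(log2 T) + 1 for T >= 1
    scale = levels / epsilon

    # Exact prefix sums, used only to read off exact interval sums.
    prefix = [0]
    for b in bits:
        prefix.append(prefix[-1] + b)

    noisy = {}                       # (start, end) -> noisy sum of bits in (start, end]
    outputs = []
    for t in range(1, T + 1):
        estimate, end = 0.0, t
        # Decompose (0, t] into dyadic intervals using the binary
        # representation of t; each interval ends at or before t.
        while end > 0:
            size = end & -end        # lowest set bit of `end`
            start = end - size
            if (start, end) not in noisy:
                exact = prefix[end] - prefix[start]
                noisy[(start, end)] = exact + laplace(scale, rng)
            estimate += noisy[(start, end)]
            end = start
        outputs.append(estimate)
    return outputs
```

Calling `binary_mechanism([1, 0, 1, 1], epsilon=1.0)` returns one noisy running count per time step. Each output sums at most `levels` noisy interval values, which is why the error of this style of mechanism grows polylogarithmically in $T$.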
