Keeping a Secret Requires a Good Memory: Space Lower-Bounds for Private Algorithms
Alessandro Epasto, Xin Lyu, Pasin Manurangsi
TL;DR
This paper addresses the memory cost of differential privacy in streaming by proving unconditional space lower bounds for user-level DP. It introduces a multi-player communication game that ties the privacy requirement to contribution capping of over-active users, showing that resolving heavy hitters under DP requires memory scaling with the number of such users. Focusing on CountDistinct in a $(w,k)$-occurrency bounded regime, it proves a lower bound $M \ge \widetilde{\Omega}(\frac{kw}{h^2})$ on the space needed for achieving a $(\tau,\eta)$-approximation, and demonstrates an exponential separation from non-private algorithms; the technique also extends to MaxSelect and Quantile. The results imply that privacy can impose fundamental memory costs even for natural statistical estimation tasks, and the proposed information-theoretic framework may apply to a broad class of private streaming problems with memory-accuracy tradeoffs.
Abstract
We study the computational cost of differential privacy in terms of memory efficiency. While the trade-off between accuracy and differential privacy is well-understood, the inherent cost of privacy regarding memory use remains largely unexplored. This paper establishes for the first time an unconditional space lower bound for user-level differential privacy by introducing a novel proof technique based on a multi-player communication game. Central to our approach, this game formally links the hardness of low-memory private algorithms to the necessity of ``contribution capping'' -- tracking and limiting the users who disproportionately impact the dataset. We demonstrate that winning this communication game requires transmitting information proportional to the number of over-active users, which translates directly to memory lower bounds. We apply this framework, as an example, to the fundamental problem of estimating the number of distinct elements in a stream and we prove that any private algorithm requires almost $\widetilde{\Omega}(T^{1/3})$ space to achieve certain error rates in a promise variant of the problem. This resolves an open problem in the literature (by Jain et al. NeurIPS 2023 and Cummings et al. ICML 2025) and establishes the first exponential separation between the space complexity of private algorithms and their non-private $\widetilde{O}(1)$ counterparts for a natural statistical estimation task. Furthermore, we show that this communication-theoretic technique generalizes to broad classes of problems, yielding lower bounds for private medians, quantiles, and max-select.
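To make the notion of contribution capping concrete, the following is a minimal Python sketch of a naive capping filter over a stream of (user, item) pairs. It is an illustration only and not the paper's construction: the function name `cap_contributions`, the `cap` parameter, and the toy stream are all assumptions introduced here. The sketch keeps one counter per user, so its state grows with the number of users it must track, which is the kind of memory cost the lower bound speaks to.

```python
from collections import defaultdict


def cap_contributions(stream, cap):
    """Yield (user, item) pairs, passing through at most `cap` items per user.

    Hypothetical helper for illustration; it stores one counter per user seen.
    """
    counts = defaultdict(int)  # per-user counters: memory grows with #users
    for user, item in stream:
        if counts[user] < cap:
            counts[user] += 1
            yield user, item
        # items from users already at the cap are dropped


# Toy stream (assumed data): user "u1" is over-active relative to cap=2.
stream = [("u1", "a"), ("u1", "b"), ("u1", "c"), ("u2", "a"), ("u1", "d")]
capped = list(cap_contributions(stream, cap=2))

# Distinct elements among the capped contributions (no privacy noise added here).
distinct = len({item for _, item in capped})
```

A private estimator would still need to add noise calibrated to the cap; this sketch shows only the bookkeeping step that limits each user's influence.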
