Private Continual Counting of Unbounded Streams

Ben Jacobsen; Kassem Fawaz

TL;DR

This work tackles differentially private continual counting on unbounded data streams by introducing a logarithmically perturbed lower-triangular Toeplitz (LTToep) matrix factorization that exactly represents the lower-triangular all-ones counting matrix M_count as a product LR. The authors prove joint validity, bounded sensitivity, and near-optimal asymptotic error for the resulting unbounded streaming algorithm, and provide an efficient, implementable procedure with O(t) space and amortized O(log t) time per update. They also offer practical extensions, including parameter choices, handling imperfect knowledge of n, and hybrid mechanisms that improve constants relative to prior baselines. Empirically, the proposed log_matrix method achieves variance within a small constant factor of the bounded-input baseline for t as large as 2^{24}, while keeping the error smooth and the privacy guarantee valid for streams of unbounded length. This work thus delivers a principled, scalable DP solution for continual counting that does not require the input size to be known in advance.

Abstract

We study the problem of differentially private continual counting in the unbounded setting where the input size $n$ is not known in advance. Current state-of-the-art algorithms based on optimal instantiations of the matrix mechanism cannot be directly applied here because their privacy guarantees only hold when key parameters are tuned to $n$. Using the common "doubling trick" avoids knowledge of $n$ but leads to suboptimal and non-smooth error. We solve this problem by introducing novel matrix factorizations based on logarithmic perturbations of the function $\frac{1}{\sqrt{1-z}}$ studied in prior works, which may be of independent interest. The resulting algorithm has smooth error, and for any $\alpha > 0$ and $t \leq n$ it is able to privately estimate the sum of the first $t$ data points with $O(\log^{2+2\alpha}(t))$ variance. It requires $O(t)$ space and amortized $O(\log t)$ time per round, compared to $O(\log(n)\log(t))$ variance, $O(n)$ space and $O(n \log n)$ pre-processing time for the nearly-optimal bounded-input algorithm of Henzinger et al. (SODA 2023). Empirically, we find that our algorithm's performance is also comparable to theirs in absolute terms: our variance is less than $1.5\times$ theirs for $t$ as large as $2^{24}$.
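To make the role of $\frac{1}{\sqrt{1-z}}$ concrete, the following is a minimal sketch of the unperturbed, bounded-input factorization from prior work, not the logarithmically perturbed one introduced in this paper. It relies on the standard fact that the Taylor coefficients $c_k = \binom{2k}{k}/4^k$ of $\frac{1}{\sqrt{1-z}}$ define a lower-triangular Toeplitz matrix whose square is the all-ones lower-triangular counting matrix. The function names, the noise scale `sigma`, and the quadratic-time matrix-vector product are illustrative assumptions; calibrating `sigma` to a privacy budget is omitted.

```python
import random

def sqrt_coeffs(n):
    """First n Taylor coefficients of 1/sqrt(1-z): c_k = binom(2k, k) / 4^k."""
    c = [1.0]
    for k in range(1, n):
        c.append(c[-1] * (2 * k - 1) / (2 * k))
    return c

def toeplitz_apply(c, v):
    """Multiply the lower-triangular Toeplitz matrix with first column c by v.

    Naive O(n^2) product, chosen for clarity rather than speed.
    """
    return [sum(c[t - j] * v[j] for j in range(t + 1)) for t in range(len(v))]

def private_prefix_sums(x, sigma, rng):
    """Matrix-mechanism sketch with L = R: return L(Rx + z), z ~ N(0, sigma^2 I).

    sigma is a hypothetical noise scale; in a real mechanism it would be
    calibrated to the sensitivity of R and the privacy parameters.
    """
    c = sqrt_coeffs(len(x))
    masked = [y + rng.gauss(0.0, sigma) for y in toeplitz_apply(c, x)]
    return toeplitz_apply(c, masked)

def sensitivity(n):
    """Largest column 2-norm of R, which for this Toeplitz R is the first column."""
    return sum(ck * ck for ck in sqrt_coeffs(n)) ** 0.5

x = [1, 0, 1, 1, 0, 1, 0, 1]
exact = private_prefix_sums(x, 0.0, random.Random(0))  # no noise: plain prefix sums
noisy = private_prefix_sums(x, 1.0, random.Random(0))  # noisy running counts
```

With `sigma = 0` the output reduces to the exact running counts, which confirms that the two Toeplitz factors multiply to the counting matrix. The value of `sensitivity(n)` keeps growing as `n` increases, which illustrates why this bounded-input construction needs its noise tuned to the stream length.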

