Continual Counting with Gradual Privacy Expiration

Joel Daniel Andersson, Monika Henzinger, Rasmus Pagh, Teresa Anna Steiner, Jalaj Upadhyay

TL;DR

This paper studies continual counting under differential privacy with gradual privacy expiration, introducing a flexible expiration function $g$ that models decreasing data sensitivity over time. It develops a pan-private, dyadic-interval based algorithm with four key adaptations to support unbounded streams, achieving an additive error of $O(\log(T)/\varepsilon)$ for a broad class of $g$, and proves a matching lower bound showing near-tightness. The mechanism runs in amortized $O(1)$ time per update and uses $O(\log T)$ space, with empirical results indicating favorable privacy-utility trade-offs compared to a natural baseline, especially for large delays. The work thus provides tight, scalable, and practical guarantees for continual counting under expiration, with potential impact on private streaming analytics and privacy-preserving federated learning.
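For orientation, the dyadic-interval idea that the summary refers to can be illustrated with the textbook binary mechanism for continual counting without expiration. The sketch below is not the paper's algorithm: it assumes a stream of known length $T$, a single privacy parameter, and none of the four adaptations, which this page does not describe. The names `laplace` and `dyadic_counter` are hypothetical.

```python
import random


def laplace(rng, scale):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dyadic_counter(bits, epsilon, rng=None):
    """Textbook binary mechanism (no privacy expiration).

    Returns one noisy prefix sum per input bit. Assumes the stream
    length is known up front, unlike the unbounded setting above.
    """
    rng = rng or random.Random()
    T = len(bits)
    levels = max(1, T.bit_length())  # number of dyadic levels in use
    # Each bit lands in at most one interval per level, so noise of
    # scale levels/epsilon on every interval sum gives epsilon-DP.
    scale = levels / epsilon
    exact = [0] * levels    # exact sum of the current interval per level
    noisy = [0.0] * levels  # released (noised) sum per level
    estimates = []
    for t, x in enumerate(bits, start=1):
        i = (t & -t).bit_length() - 1  # index of the lowest set bit of t
        # Merge all lower-level intervals plus the new bit into level i.
        exact[i] = x + sum(exact[:i])
        for j in range(i):
            exact[j] = 0
            noisy[j] = 0.0
        noisy[i] = exact[i] + laplace(rng, scale)
        # The prefix [1, t] is covered by one interval per set bit of t.
        estimates.append(sum(noisy[j] for j in range(levels) if (t >> j) & 1))
    return estimates


if __name__ == "__main__":
    stream = [1, 0, 1, 1, 0, 1, 0, 0]
    print(dyadic_counter(stream, epsilon=1.0, rng=random.Random(0)))
```

Each output sums at most `levels` noisy interval counts, which is why the error of this construction grows polylogarithmically in the stream length rather than linearly.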

Abstract

Differential privacy with gradual expiration models the setting where data items arrive in a stream and at a given time $t$ the privacy loss guaranteed for a data item seen at time $t-d$ is $\varepsilon g(d)$, where $g$ is a monotonically non-decreasing function. We study the fundamental continual (binary) counting problem where each data item consists of a bit, and the algorithm needs to output at each time step the sum of all the bits streamed so far. For a stream of length $T$ and privacy without expiration, continual counting is possible with maximum (over all time steps) additive error $O(\log^2(T)/\varepsilon)$ and the best known lower bound is $\Omega(\log(T)/\varepsilon)$; closing this gap is a challenging open problem. We show that the situation is very different for privacy with gradual expiration by giving upper and lower bounds for a large set of expiration functions $g$. Specifically, our algorithm achieves an additive error of $O(\log(T)/\varepsilon)$ for a large set of privacy expiration functions. We also give a lower bound that shows that if $C$ is the additive error of any $\varepsilon$-DP algorithm for this problem, then the product of $C$ and the privacy expiration function after $2C$ steps must be $\Omega(\log(T)/\varepsilon)$. Our algorithm matches this lower bound as its additive error is $O(\log(T)/\varepsilon)$, even when $g(2C) = O(1)$. Our empirical evaluation shows that we achieve a slowly growing privacy loss with significantly smaller empirical privacy loss for large values of $d$ than a natural baseline algorithm.
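The lower bound stated in the abstract can be written out as a trade-off between the error $C$ and the expiration function $g$. The display below is only a rearrangement of that statement, with constants suppressed; it is a reading aid and not a result taken from the paper's proofs.

```latex
% Lower bound from the abstract: C is the additive error of an
% \varepsilon-DP algorithm with expiration function g.
\[
  C \cdot g(2C) = \Omega\!\left(\frac{\log T}{\varepsilon}\right)
  \quad\Longleftrightarrow\quad
  g(2C) = \Omega\!\left(\frac{\log T}{\varepsilon\, C}\right).
\]
% Special case: if the privacy loss after 2C steps is still constant,
% the error cannot be smaller than order \log(T)/\varepsilon.
\[
  g(2C) = O(1)
  \quad\Longrightarrow\quad
  C = \Omega\!\left(\frac{\log T}{\varepsilon}\right).
\]
```

Read this way, the upper bound of $O(\log(T)/\varepsilon)$ meets the lower bound exactly in the regime where $g(2C) = O(1)$, which is the matching case the abstract describes.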

Paper Structure (25 sections, 10 theorems, 36 equations, 5 figures, 1 table, 3 algorithms)


Key Result

Theorem 1.2

Let $\lambda\in\mathbb{R}_{>0}\setminus\{\tfrac{3}{2}\}$ be a constant, and let parameters $\varepsilon \in \mathbb{R}_{>0}$ and $B\in\mathbb{N}$ be given. There exists an algorithm $\mathcal{A}$ that approximates prefix sums of a (potentially unbounded) input sequence $x_1, x_2, \dots$ with $x_i$ [...] Considering all releases up to and including input $t$, the algorithm $\mathcal{A}$ uses $O(B+\log \dots)$ [...]

Figures (5)

  • Figure 1: Plots of the privacy loss for our algorithm and a baseline algorithm.
  • Figure 2: Worst-case privacy loss computed empirically for a data item streamed $d$ steps earlier.
  • Figure 3: Worst-case privacy loss for a data item streamed $d$ steps earlier, shown for our algorithm (with $\lambda=1, 2, 3$) versus the baseline ($W=127$ and $W=1023$).
  • Figure 4: Worst-case privacy loss computed empirically for a data item streamed $d$ steps earlier. One panel re-computes the baseline plot with the ratio $\varepsilon_{past}/\varepsilon_{cur}$ set to minimize the maximum privacy loss, yielding a ratio of $0.069$ for $W=31$, $0.08$ for $W=63$ and $0.095$ for $W=127$. The other panel re-computes the comparison plot with the ratio $\varepsilon_{past}/\varepsilon_{cur}$ set to minimize the maximum privacy loss, yielding a ratio of $0.0064$ for $W=127$ and $0.010$ for $W=1023$.

Theorems & Definitions (20)

  • Definition 1.1
  • Theorem 1.2
  • Corollary 1.3
  • Theorem 1.4
  • Lemma 2.2
  • Lemma 3.1
  • Lemma 3.4
  • Lemma 4.1
  • Theorem 5.1
  • Definition B.1: Laplace Distribution
  • ...and 10 more