
Private Counting of Distinct Elements in the Turnstile Model and Extensions

Monika Henzinger, A. R. Sricharan, Teresa Anna Steiner

TL;DR

This work shows that a very simple algorithm based on the sparse vector technique achieves a tight additive error for item-level $(\epsilon,\delta)$-differential privacy and item-level $\epsilon$-differential privacy with respect to a different parameterization, namely the sum of all flippancies.

Abstract

Privately counting distinct elements in a stream is a fundamental data analysis problem with many applications in machine learning. In the turnstile model, Jain et al. [NeurIPS2023] initiated the study of this problem parameterized by the maximum flippancy of any element, i.e., the number of times that the count of an element changes from 0 to above 0 or vice versa. They give an item-level $(ε,δ)$-differentially private algorithm whose additive error is tight with respect to that parameterization. In this work, we show that a very simple algorithm based on the sparse vector technique achieves a tight additive error for item-level $(ε,δ)$-differential privacy and item-level $ε$-differential privacy with respect to a different parameterization, namely the sum of all flippancies. Our second result is a bound showing that, for a large class of algorithms, including all existing differentially private algorithms for this problem, the lower bound from item-level differential privacy extends to event-level differential privacy. This partially answers an open question by Jain et al. [NeurIPS2023].
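
To make the two parameterizations concrete, the following minimal, non-private sketch computes the distinct count over time together with the maximum flippancy and the total flippancy (sum of all flippancies) of a small turnstile stream. The function name `flippancy_stats` and the `(item, delta)` stream encoding are assumptions made for illustration and do not come from the paper; the sketch also assumes that an item is never deleted when its count is 0. It adds no noise and is not the paper's algorithm.

```python
from collections import defaultdict


def flippancy_stats(stream):
    """Non-private illustration of flippancy in a turnstile stream.

    `stream` is an iterable of (item, delta) pairs with delta in {+1, -1}
    (hypothetical encoding). Assumes counts never drop below 0.
    Returns (distinct_over_time, max_flippancy, total_flippancy).
    """
    counts = defaultdict(int)  # current count of each item
    flips = defaultdict(int)   # flippancy of each item so far
    distinct = 0               # number of items with count > 0
    distinct_over_time = []

    for item, delta in stream:
        was_present = counts[item] > 0
        counts[item] += delta
        is_present = counts[item] > 0
        # A flip happens when the count moves from 0 to above 0 or back.
        if was_present != is_present:
            flips[item] += 1
            distinct += 1 if is_present else -1
        distinct_over_time.append(distinct)

    max_flippancy = max(flips.values(), default=0)
    total_flippancy = sum(flips.values())
    return distinct_over_time, max_flippancy, total_flippancy


stream = [("a", +1), ("b", +1), ("a", -1), ("a", +1), ("a", +1), ("b", -1)]
distinct_over_time, max_flip, total_flip = flippancy_stats(stream)
print(distinct_over_time, max_flip, total_flip)
```

In this example item `a` flips three times and item `b` twice, so the maximum flippancy is 3 while the total flippancy is 5. The second insertion of `a` changes its count from 1 to 2 and therefore does not count as a flip.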

Paper Structure

This paper contains 13 sections, 18 theorems, 25 equations, 1 figure, 1 table, 6 algorithms.

Key Result

Theorem 1

Let $d$ be a non-zero integer, $\beta>0$, $K$ be a known upper bound on the total flippancy, and let $T$ be a known upper bound on the number of time steps. Then there exists

Theorems & Definitions (31)

  • Definition 1: CountDistinct
  • Theorem 1
  • Theorem 2: Simplified version of Theorem \ref{thm:lower_eps}
  • Theorem 3: Simplified version of Theorem \ref{thm:lowerbound_epsdel}
  • Theorem 4: Simplified version of Theorem \ref{thm:event-level}
  • Theorem 5
  • Definition 2: Differential privacy [Dwork2006]
  • Definition 3: Laplace Distribution
  • Definition 4: Sensitivity
  • Corollary 1
  • ...and 21 more