Preserving Target Distributions With Differentially Private Count Mechanisms

Nitin Kohli, Paul Laskowski

Abstract

Differentially private mechanisms are increasingly used to publish tables of counts, where each entry represents the number of individuals belonging to a particular category. A distribution of counts summarizes the information in the count column, unlinking counts from categories. This object is useful for answering a class of research questions, but it is subject to statistical biases when counts are privatized with standard mechanisms. This motivates a novel design criterion we term accuracy of distribution. This study formalizes a two-stage framework for privatizing tables of counts that balances accuracy of distribution with two standard criteria: accuracy of counts and runtime. In the first stage, a distribution privatizer generates an estimate of the true distribution of counts. We introduce a new mechanism, called the cyclic Laplace, that is specifically tailored to distributions of counts and outperforms existing general-purpose differentially private histogram mechanisms. In the second stage, a constructor algorithm generates a count mechanism, represented as a transition matrix, whose fixed point is the privatized distribution of counts. We develop a mathematical theory that describes such transition matrices in terms of simple building blocks we call epsilon-scales. This theory informs the design of a new constructor algorithm that generates transition matrices with favorable properties more efficiently than standard optimization algorithms. We explore the practicality of our framework with a set of experiments, highlighting situations in which a fixed-point method provides a favorable tradeoff among performance criteria.
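As a rough illustration of two objects named in the abstract, the sketch below computes a distribution of counts from a small table and checks that a transition matrix has that distribution as its fixed point. Everything here is an assumption made for illustration: the toy table, the function names, the column-stochastic convention (column `i` is the output distribution for true count `i`), and the treatment of counts differing by one as neighbors. The matrix is a simple mixture of the identity with a rank-one matrix, chosen only because its fixed point is easy to verify; it is not the paper's constructor algorithm, the cyclic Laplace mechanism, or the epsilon-scale construction.

```python
import numpy as np

def distribution_of_counts(counts, n):
    """Fraction of categories holding each count value 0..n."""
    hist = np.bincount(np.asarray(counts), minlength=n + 1)
    return hist / hist.sum()

def mixing_matrix(p, a):
    """Column-stochastic matrix (1 - a) * I + a * p 1^T (illustrative only).

    With probability 1 - a the true count is reported unchanged;
    with probability a the output is drawn from p instead.
    """
    k = len(p)
    return (1 - a) * np.eye(k) + a * np.outer(p, np.ones(k))

def adjacent_epsilon(M):
    """Largest absolute log-ratio between columns of adjacent true counts.

    Assumes all entries of M are strictly positive.
    """
    log_m = np.log(M)
    return float(np.max(np.abs(log_m[:, 1:] - log_m[:, :-1])))

# Hypothetical table of counts: category -> number of individuals.
table = {"A": 0, "B": 2, "C": 2, "D": 1}
p = distribution_of_counts(list(table.values()), n=2)

M = mixing_matrix(p, a=0.5)
print("distribution of counts:", p)
print("fixed point holds:", np.allclose(M @ p, p))
print("epsilon over adjacent counts:", adjacent_epsilon(M))
```

The fixed-point property means that if true counts are distributed according to `p`, the privatized counts are too, which is the sense in which such a mechanism preserves a target distribution. In the framework described above, `p` would be the privatized distribution from the first stage rather than the true one.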

Paper Structure

This paper contains 32 sections, 28 theorems, 57 equations, 8 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

The cyclic Laplace mechanism for distributions of counts satisfies $\epsilon$-differential privacy. $\blacktriangleleft$
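
For reference, the standard definition of $\epsilon$-differential privacy can be written as follows. The notation is generic and assumed here for illustration ($\mathcal{M}$ for a randomized mechanism, $D$ and $D'$ for neighboring inputs, $S$ for a set of outputs); the paper's own definitions may use different symbols and a specific notion of neighboring distributions of counts.

$$\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon} \, \Pr[\mathcal{M}(D') \in S] \quad \text{for all neighboring } D, D' \text{ and all measurable } S.$$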

Figures (8)

  • Figure 1: A. Table of counts of bird flu cases by U.S. state as of February 27, 2025. B. The corresponding distribution of counts. In this figure, $N = 50$ and $n = 41$.
  • Figure 2: A framework for constructing and utilizing count mechanisms when the true distribution must be protected.
  • Figure 3: Comparison of distribution error for the cyclic Laplace mechanism against other mechanisms with $\epsilon_1 = 1$.
  • Figure 4: Distributions of the three datasets used in our experiments: (A) synthetic draws from a binomial distribution, (B) observed counts of homicides in US counties, and (C) observed counts of teachers in public schools.
  • Figure 5: Distribution error under different total privacy budgets on a logarithmic scale. All fixed-point constructors were visually overlapping, so we include only the heuristic with sandwich selector to represent all fixed-point constructors.
  • ...and 3 more figures

Theorems & Definitions (50)

  • Definition 1
  • Definition 2
  • Theorem 1
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Lemma 1
  • Corollary 1
  • Theorem 2
  • ...and 40 more