N-output Mechanism: Estimating Statistical Information from Numerical Data under Local Differential Privacy

Incheol Baek, Yon Dohn Chung

TL;DR

This work proposes the N-output mechanism, a generalized framework that maps numerical data to one of $N$ discrete outputs. It formulates the mechanism's design as an optimization problem that minimizes estimation variance for any given $N$, and develops both numerical and analytical solutions.

Abstract

Local Differential Privacy (LDP) addresses significant privacy concerns in sensitive data collection. In this work, we focus on numerical data collection under LDP, targeting a significant gap in the literature: existing LDP mechanisms are optimized for either very small ($|\Omega| \in \{2, 3\}$) or infinite output spaces, and no general method exists for constructing an optimal mechanism for an arbitrary output size $N$. To fill this gap, we propose the N-output mechanism, a generalized framework that maps numerical data to one of $N$ discrete outputs. We formulate the mechanism's design as an optimization problem that minimizes estimation variance for any given $N \geq 2$, and we develop both numerical and analytical solutions. The resulting mechanism is highly accurate and adaptive, because its design is obtained by solving this optimization problem for whichever $N$ is chosen. Furthermore, we extend our framework and existing mechanisms to the task of distribution estimation. Empirical evaluations show that the N-output mechanism achieves state-of-the-art accuracy for mean, variance, and distribution estimation with small communication costs.
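
For intuition about what a mechanism with a very small output space looks like, the sketch below implements the well-known binary ($|\Omega| = 2$) mechanism of Duchi et al. for inputs in $[-1, 1]$. It is not the N-output mechanism proposed in this paper, whose output values and probabilities come from the optimization problem described above. The function names `binary_mechanism` and `estimate_mean`, the input range, and the synthetic data are assumptions made for illustration.

```python
import math
import random

def binary_mechanism(x, epsilon, rng=random):
    """Perturb x in [-1, 1] into one of two outputs, -C or +C (the N = 2 case)."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in [-1, 1]")
    e = math.exp(epsilon)
    c = (e + 1.0) / (e - 1.0)  # output magnitude that makes each report unbiased
    # P(output = +C | x); it ranges over [1/(e+1), e/(e+1)], so the ratio of
    # output probabilities for any two inputs is at most exp(epsilon).
    p_plus = 0.5 + x * (e - 1.0) / (2.0 * (e + 1.0))
    return c if rng.random() < p_plus else -c

def estimate_mean(reports):
    """Each report is unbiased for its input, so the sample mean estimates the true mean."""
    return sum(reports) / len(reports)

# Synthetic data (an assumption for the demo), perturbed with epsilon = 1.
rng = random.Random(0)
data = [rng.uniform(-1.0, 1.0) for _ in range(100_000)]
reports = [binary_mechanism(x, 1.0, rng) for x in data]
true_mean = sum(data) / len(data)
est_mean = estimate_mean(reports)
print(f"true mean = {true_mean:.4f}, estimated mean = {est_mean:.4f}")
```

Each user sends a single bit, which is the minimal communication cost. Allowing $N > 2$ outputs gives the designer more freedom to reduce estimation variance, and that trade-off is what the paper's optimization targets.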

Paper Structure

This paper contains 32 sections, 10 theorems, 96 equations, 6 figures, 1 table, 2 algorithms.

Key Result

Proposition 1

For $[x_{j-1}, x_j]$, its maximum value within this interval must occur at one of the endpoints, $x_{j-1}$ or $x_j$, or at its vertex, $\bar{x}_j$, if the vertex lies within the interval. The vertex $\bar{x}_j$ is defined as:
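
The closed-form expression for $\bar{x}_j$ is not available in this excerpt. The endpoint-or-vertex argument itself can still be sketched generically, assuming the quantity being maximized is a quadratic $f(x) = ax^2 + bx + c$ on each interval, which the reference to a vertex suggests. The coefficients `a`, `b`, `c` and the function name are hypothetical and are not taken from the paper.

```python
def max_quadratic_on_interval(a, b, c, lo, hi):
    """Return (x_star, f(x_star)) maximizing f(x) = a*x**2 + b*x + c on [lo, hi].

    Only three candidates need checking: the two endpoints and, if it lies
    strictly inside the interval, the vertex -b / (2a).
    """
    if lo > hi:
        raise ValueError("interval must satisfy lo <= hi")

    def f(x):
        return a * x * x + b * x + c

    candidates = [lo, hi]
    if a != 0:
        vertex = -b / (2 * a)
        if lo < vertex < hi:
            candidates.append(vertex)
    x_star = max(candidates, key=f)
    return x_star, f(x_star)

# Concave parabola with its vertex inside the interval: the vertex wins.
print(max_quadratic_on_interval(-1.0, 2.0, 0.0, 0.0, 3.0))   # → (1.0, 1.0)
# Same parabola, vertex outside the interval: an endpoint wins.
print(max_quadratic_on_interval(-1.0, 2.0, 0.0, 2.0, 3.0))   # → (2.0, 0.0)
```

Checking a constant number of candidates per interval means a worst-case value over a piecewise-defined function can be found in time linear in the number of intervals, without a grid search.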

Figures (6)

  • Figure 1: An example of the N-output mechanism for $N=5$ and $\epsilon=1$. The output values are $a_1 \approx 1.87$ and $a_2 \approx 2.57$. The symmetric outputs $a_{-1}, a_{-2}$ are omitted for clarity in (a).
  • Figure 2: Theoretical worst-case noise variance. Lower is better.
  • Figure 3: Worst-case mean estimation RMSE. Lower is better.
  • Figure 4: Mean estimation (RMSE, $\log$ scale) on real-world datasets. Lower is better.
  • Figure 5: Distribution estimation (Wasserstein distance) on real-world datasets. Lower is better.
  • ...and 1 more figure

Theorems & Definitions (30)

  • Definition 1
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Proposition 4
  • proof
  • Theorem 1
  • ...and 20 more