Differentially Private Selection using Smooth Sensitivity

Akito Yamamoto, Tetsuo Shibuya

TL;DR

This study proposes a novel mechanism for differentially private selection based on smooth sensitivity, proves that it satisfies strict privacy guarantees, and presents fundamental theorems that improve on the current smooth sensitivity-based algorithm and on the analysis of its noise distributions.

Abstract

With the growing volume of data in society, the need for privacy protection in data analysis also rises. In particular, private selection tasks, wherein the most important information is retrieved under differential privacy, are emphasized in a wide range of contexts, including machine learning and medical statistical analysis. However, existing mechanisms use global sensitivity, which may add a larger amount of perturbation than is necessary. Therefore, this study proposes a novel mechanism for differentially private selection using the concept of smooth sensitivity and presents theoretical proofs of strict privacy guarantees. In addition, given that the current state-of-the-art algorithm using smooth sensitivity is still of limited use, and that the theoretical analysis of the basic properties of the noise distributions is not yet rigorous, we present fundamental theorems to improve upon them. Furthermore, new theorems are proposed for efficient noise generation. Experiments demonstrate that the proposed mechanism can provide higher accuracy than the existing global sensitivity-based methods. Finally, we show key directions for further theoretical development. Overall, this study can be an important foundational work for expanding the potential of smooth sensitivity in privacy-preserving data analysis. The Python implementation of our experiments and supplemental results are available at https://github.com/ay0408/Smooth-Private-Selection.
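
For orientation, the contrast between global and smooth sensitivity mentioned in the abstract can be written out using the standard definitions from the smooth sensitivity framework. This is a sketch under assumptions: the notation ($\mathrm{GS}_f$, $\mathrm{LS}_f$, $S^*_{f,\beta}$) and the choice of $d$ as the Hamming distance between datasets are not taken from this page, and the paper's own definitions may be stated differently.

$$\mathrm{GS}_f = \max_{x, x' :\, d(x, x') = 1} \lVert f(x) - f(x') \rVert_1$$

$$\mathrm{LS}_f(x) = \max_{x' :\, d(x, x') = 1} \lVert f(x) - f(x') \rVert_1$$

$$S^*_{f,\beta}(x) = \max_{x' \in D^n} \mathrm{LS}_f(x') \, e^{-\beta \, d(x, x')}$$

Under these definitions, $\mathrm{LS}_f(x) \le S^*_{f,\beta}(x) \le \mathrm{GS}_f$ for every dataset $x$, which is why noise calibrated to smooth sensitivity is never larger in scale than noise calibrated to global sensitivity.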

Paper Structure

This paper contains 25 sections, 10 theorems, 51 equations, 2 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

($\epsilon$-differentially private algorithm using a smooth upper bound 3) Let $h$ be an $(\alpha,\beta)$-admissible noise probability density function, and set $\alpha = \alpha(\epsilon)$ and $\beta = \beta(\epsilon)$. Let $Z$ be a random variable derived from $h$. For a function $f: D^n \right

Figures (2)

  • Figure 1: Averaged accuracy when (a) $m = 5$, (b) $m = 10$, (c) $m = 15$, and (d) $m = 20$. The $x$-axis represents the achieved privacy level $\epsilon$. The $y$-axis represents the probability that the most significant SNP was correctly extracted. We compared the exponential mechanism (EM), permute-and-flip (PF), and our smooth private selection (SPS). The error bars represent the range of the results over the five attempts.
  • Figure 2: Averaged accuracy when (a) $m = 5$, (b) $m = 10$, (c) $m = 15$, and (d) $m = 20$. The $x$-axis represents the achieved privacy level $\epsilon$. The $y$-axis represents the probability that the most significant SNP was correctly extracted. We compared the noise distributions with $\gamma = 2$, $4$, $6$, and $10$ using our smooth private selection. The error bars represent the range of the results over the five attempts.
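
The two global sensitivity-based baselines named in Figure 1, the exponential mechanism (EM) and permute-and-flip (PF), can be sketched in their standard textbook forms as follows. This is a minimal illustration, not the authors' implementation (which is in the linked repository): the function names, the `sensitivity` argument (a global sensitivity bound on the score function), and the example scores are all assumptions made for this sketch. The proposed smooth private selection (SPS) is not reproduced here because its details do not appear on this page.

```python
import numpy as np


def exponential_mechanism(scores, epsilon, sensitivity, rng):
    """Textbook EM: sample index i with probability proportional to
    exp(epsilon * score_i / (2 * sensitivity))."""
    scores = np.asarray(scores, dtype=float)
    # Subtracting the maximum avoids overflow and leaves the
    # normalized probabilities unchanged.
    logits = epsilon * (scores - scores.max()) / (2.0 * sensitivity)
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(scores), p=probs))


def permute_and_flip(scores, epsilon, sensitivity, rng):
    """Textbook PF: visit candidates in random order and accept
    candidate i with probability
    exp(epsilon * (score_i - max_score) / (2 * sensitivity))."""
    scores = np.asarray(scores, dtype=float)
    best = scores.max()
    for i in rng.permutation(len(scores)):
        accept_prob = np.exp(epsilon * (scores[i] - best) / (2.0 * sensitivity))
        if rng.random() <= accept_prob:
            return int(i)
    # A top-scoring candidate has acceptance probability 1,
    # so the loop always returns before reaching this line.
    raise RuntimeError("permute-and-flip failed to select a candidate")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical utility scores for four candidates.
    example_scores = [3.0, 7.5, 7.0, 1.0]
    print(exponential_mechanism(example_scores, epsilon=1.0, sensitivity=1.0, rng=rng))
    print(permute_and_flip(example_scores, epsilon=1.0, sensitivity=1.0, rng=rng))
```

Both functions take the same `sensitivity` value for every dataset, which is the property the abstract identifies as potentially adding more perturbation than necessary.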

Theorems & Definitions (27)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Lemma 1
  • Definition 7
  • Definition 8
  • Theorem 1
  • ...and 17 more