Enhancing Trade-offs in Privacy, Utility, and Computational Efficiency through MUltistage Sampling Technique (MUST)

Xingyuan Zhao, Ruyu Zhou, Fang Liu

TL;DR

Theoretical and empirical results suggest that MUST offers stronger privacy amplification (PA) in $\epsilon$ than common one-stage sampling procedures, including Poisson sampling, sampling without replacement, and sampling with replacement, while the results on $\delta$ vary case by case.

Abstract

Applying a randomized algorithm to a subset rather than the entire dataset amplifies privacy guarantees. We propose a class of subsampling methods, the "MUltistage Sampling Technique (MUST)", for privacy amplification (PA) in the context of differential privacy (DP). We conduct comprehensive analyses of the PA effects and utility of several 2-stage MUST procedures through newly introduced concepts, including strong vs weak PA effects and the aligned privacy profile. We provide a privacy loss composition analysis over repeated applications of MUST via the Fourier accountant algorithm. Our theoretical and empirical results suggest that MUST offers stronger PA in $\epsilon$ than the common one-stage sampling procedures, including Poisson sampling, sampling without replacement, and sampling with replacement, while the results on $\delta$ vary case by case. Our experiments show that MUST is non-inferior to one-stage subsampling methods in the utility and stability of privacy-preserving (PP) outputs at similar privacy loss, while enhancing the computational efficiency of algorithms that require complex function calculations on distinct data points. MUST can be seamlessly integrated into stochastic optimization algorithms or procedures that involve parallel or simultaneous subsampling when DP guarantees are necessary.

Paper Structure (32 sections, 6 theorems, 33 equations, 11 figures, 10 tables, 2 algorithms)

Key Result

Theorem 1

(privacy profiles of Laplace and Gaussian mechanisms) Let $f$ be an output function with $\ell_1$ global sensitivity $\Delta_1$ and $\ell_2$ global sensitivity $\Delta_2$. [Footnote: The $\ell_p$ global sensitivity (liu2018generalized) of a function $f: X\to \mathbb{R}^d$ is $\Delta_p(f)=\sup_{X\simeq X'} \|f(X)-f\ldots$]
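
The theorem statement is cut off here, so the following is only an illustrative sketch and not the paper's own formulation. It evaluates the closed-form privacy profiles $\delta(\epsilon)$ of the Laplace and Gaussian mechanisms that are standard in the DP literature. The function names and the parameters `scale` (Laplace scale) and `sigma` (Gaussian standard deviation) are assumptions introduced for this example.

```python
import math


def _phi(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def laplace_profile(eps, delta1, scale):
    """delta(eps) for the Laplace mechanism with l1 sensitivity delta1.

    Standard closed form: max(0, 1 - exp((eps - delta1/scale) / 2)).
    """
    return max(0.0, 1.0 - math.exp((eps - delta1 / scale) / 2.0))


def gaussian_profile(eps, delta2, sigma):
    """delta(eps) for the Gaussian mechanism with l2 sensitivity delta2.

    Standard closed form:
    Phi(delta2/(2 sigma) - eps sigma/delta2)
      - exp(eps) * Phi(-delta2/(2 sigma) - eps sigma/delta2).
    """
    a = delta2 / (2.0 * sigma)
    b = eps * sigma / delta2
    return _phi(a - b) - math.exp(eps) * _phi(-a - b)


if __name__ == "__main__":
    # Ratio sensitivity/noise = 1, as in the setting of Figure 5.
    for eps in (0.0, 0.5, 1.0):
        print(eps, laplace_profile(eps, 1.0, 1.0), gaussian_profile(eps, 1.0, 1.0))
```

Both profiles decrease as $\epsilon$ grows; the Laplace profile reaches exactly zero once $\epsilon \ge \Delta_1/\text{scale}$, whereas the Gaussian profile stays positive for every finite $\epsilon$.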

Figures (11)

  • Figure 1: A general MUST procedure. WR and WOR refer to sampling with and without replacement, respectively; the number in each box is the respective dataset size.
  • Figure 2: 2-stage MUST procedure (numbers in boxes are dataset sizes)
  • Figure 3: PA effect of a sampling procedure $\mathcal{S}$ on a base randomized mechanism. When $\epsilon'/\epsilon<1$ and $\delta'-\delta<0$, $\mathcal{S}$ yields strong PA; when exactly one of the conditions is not satisfied, $\mathcal{S}$ yields weak PA; when both conditions fail, $\mathcal{S}$ leads to privacy dilution.
  • Figure 4: Aligned privacy profile plots of $\epsilon'/\epsilon$ vs $(\delta'-\delta)$ for the Laplace and Gaussian mechanisms with subsampling schemes WOR (Poisson), WR, MUSTow and MUSTww when $n=1000, m=400$, and $b=500$.
  • Figure 5: Contour plots of $\eta$ vs $(b, m)$ for MUST$\circ\mathcal{M}$ ($\mathcal{M}$ is a generic base mechanism) and $(\delta'-\delta)$ vs $(b, m)$ for MUST $\circ$ Laplace mechanism with $\Delta_1/\sigma\!=\!1$ and MUST $\circ$ Gaussian mechanism with $\Delta_2/\sigma\!=\!1$ ($n=1000$, $\epsilon=1$ for the base mechanism)
  • ...and 6 more figures
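
The 2-stage procedure described in the captions of Figures 1, 2, and 4 can be sketched as follows: draw $b$ records from the $n$ originals in stage I, then draw $m$ records from those $b$ in stage II, each stage using either WR or WOR. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name `must_two_stage` and its parameters are hypothetical, and reading "MUSTow" as WOR in stage I followed by WR in stage II is an inference from the naming in Corollary 5.

```python
import random


def must_two_stage(n, b, m, stage1="WOR", stage2="WR", rng=None):
    """Return m record indices (into range(n)) from a 2-stage subsample.

    stage1 / stage2 are "WR" (with replacement) or "WOR" (without).
    Assumed reading: MUSTow = ("WOR", "WR"), MUSTww = ("WR", "WR").
    """
    if stage1 not in ("WR", "WOR") or stage2 not in ("WR", "WOR"):
        raise ValueError("stage1 and stage2 must be 'WR' or 'WOR'")
    if stage1 == "WOR" and b > n:
        raise ValueError("WOR in stage I requires b <= n")
    if stage2 == "WOR" and m > b:
        raise ValueError("WOR in stage II requires m <= b")
    rng = rng or random.Random()
    population = list(range(n))

    # Stage I: draw b records from the n originals.
    if stage1 == "WOR":
        first = rng.sample(population, b)
    else:
        first = rng.choices(population, k=b)

    # Stage II: draw m records from the stage-I draws.
    if stage2 == "WOR":
        return rng.sample(first, m)
    return rng.choices(first, k=m)


if __name__ == "__main__":
    # Sizes taken from the Figure 4 caption: n=1000, b=500, m=400.
    subset = must_two_stage(1000, 500, 400, "WOR", "WR", rng=random.Random(0))
    print(len(subset), "draws,", len(set(subset)), "distinct records")
```

Whenever a stage samples with replacement, the final subset can contain repeated records, so the number of distinct records may be smaller than $m$. This is the property the abstract connects to computational savings for algorithms that evaluate a costly function once per distinct data point.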

Theorems & Definitions (16)

  • Definition 1: $(\epsilon,\delta)$-DP (dwork2006calibrating, dwork2006our)
  • Theorem 1
  • Definition 2: privacy loss random variable dwork2016concentrated
  • Definition 3: privacy loss distribution (PLD) sommer2018privacy
  • Lemma 2: relation between $\omega_{X/X'}$ and $\omega_{X'/X}$ sommer2018privacy
  • Theorem 3: privacy profile in $k$-fold composition sommer2018privacy
  • Definition 4: 2-step MUST
  • Theorem 4: privacy profile of $\mathcal{M}\circ$MUST
  • Corollary 5: equivalence between MUSTwo and WR
  • Definition 5: PA types
  • ...and 6 more
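
The PA types of Definition 5, as summarized in the Figure 3 caption, reduce to two comparisons between the base mechanism's $(\epsilon, \delta)$ and the subsampled mechanism's $(\epsilon', \delta')$. The sketch below encodes that rule; the function name `pa_type` and the numeric inputs in the usage lines are hypothetical and chosen only for illustration.

```python
def pa_type(eps_base, delta_base, eps_sub, delta_sub):
    """Classify the PA effect of a sampling procedure on a base mechanism.

    Rule from the Figure 3 caption:
      eps'/eps < 1 and delta' - delta < 0  -> strong PA
      exactly one condition holds          -> weak PA
      neither condition holds              -> privacy dilution
    """
    if eps_base <= 0:
        raise ValueError("eps_base must be positive")
    eps_improves = eps_sub / eps_base < 1
    delta_improves = delta_sub - delta_base < 0
    if eps_improves and delta_improves:
        return "strong PA"
    if eps_improves or delta_improves:
        return "weak PA"
    return "privacy dilution"


if __name__ == "__main__":
    # Hypothetical values, not results from the paper.
    print(pa_type(1.0, 1e-3, 0.4, 5e-4))
    print(pa_type(1.0, 1e-3, 0.4, 2e-3))
    print(pa_type(1.0, 1e-3, 1.2, 2e-3))
```

Plotting $\epsilon'/\epsilon$ against $\delta'-\delta$, as the aligned privacy profile plots in Figure 4 do, places these three cases in separate regions of the plane.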