Privacy Profiles for Private Selection

Antti Koskela, Rachel Redberg, Yu-Xiang Wang

TL;DR

This work introduces a recipe for bounding the privacy profiles $δ(ε)$ of private selection mechanisms in terms of the privacy profiles of their base mechanisms, closing a gap left by Rényi-DP approaches. By formulating general bounds via probability generating functions and hockey-stick divergences, the authors derive distribution-specific results for a number of rounds drawn from a truncated negative binomial or a binomial distribution, yielding tighter $(ε,δ)$ guarantees and GDP-style insights. These bounds translate into practical gains for hyperparameter tuning in private learning: DP-SGD and PTR-based approaches can evaluate more candidates at a lower privacy cost than under RDP-based accounting. The methods offer a flexible, numerically friendly framework that improves end-to-end private learning performance and suggests new, more concentrated distributions for the number of rounds.

Abstract

Private selection mechanisms (e.g., Report Noisy Max, Sparse Vector) are fundamental primitives of differentially private (DP) data analysis with wide applications to private query release, voting, and hyperparameter tuning. Recent work (Liu and Talwar, 2019; Papernot and Steinke, 2022) has made significant progress in both generalizing private selection mechanisms and tightening their privacy analysis using modern numerical privacy accounting tools, e.g., Rényi DP. But Rényi DP is known to be lossy when $(ε,δ)$-DP is ultimately needed, and there is a trend to close the gap by directly handling privacy profiles, i.e., $δ$ as a function of $ε$, or its equivalent dual form known as $f$-DP. In this paper, we work out an easy-to-use recipe that bounds the privacy profiles of ReportNoisyMax and PrivateTuning using the privacy profiles of the base algorithms they corral. Numerically, our approach improves over the RDP-based accounting in all regimes of interest and leads to substantial benefits in end-to-end private learning experiments. Our analysis also suggests new distributions for randomizing the number of rounds, e.g., the binomial distribution, which lead to more substantial improvements in certain regimes.
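To make the setting concrete, the following is a minimal sketch of private selection with a random number of rounds, in the style the abstract attributes to Liu and Talwar (2019) and Papernot and Steinke (2022): draw $K$, run a base mechanism $K$ times, and release only the best run. The function names, the toy base mechanism, and the choice of a geometric $K$ are assumptions made for illustration; this is not the authors' implementation, and it performs no privacy accounting.

```python
import random


def sample_geometric(rng, gamma):
    """Draw K >= 1 with P[K = k] = gamma * (1 - gamma)**(k - 1), so E[K] = 1 / gamma."""
    k = 1
    while rng.random() >= gamma:  # continue with probability 1 - gamma
        k += 1
    return k


def toy_base_mechanism(rng, sigma=4.0, num_candidates=10):
    """Hypothetical base mechanism: pick a random candidate and release a
    Gaussian-noised quality score (a stand-in for one private training run)."""
    candidate = rng.randrange(num_candidates)
    true_score = candidate / num_candidates
    noisy_score = true_score + rng.gauss(0.0, sigma)
    return candidate, noisy_score


def private_selection(rng, gamma, base_mechanism=toy_base_mechanism):
    """Run the base mechanism K ~ Geometric(gamma) times and return the run
    with the highest released score, together with K."""
    k = sample_geometric(rng, gamma)
    runs = [base_mechanism(rng) for _ in range(k)]
    best = max(runs, key=lambda run: run[1])
    return best, k


if __name__ == "__main__":
    rng = random.Random(0)
    (candidate, score), k = private_selection(rng, gamma=1 / 30)
    print(f"rounds={k}, selected candidate={candidate}, released score={score:.3f}")
```

Only the selected run is released; the privacy cost of that release is what the paper's bounds quantify, as a function of the base mechanism's privacy profile and the distribution of $K$.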

Paper Structure (25 sections, 30 theorems, 87 equations, 8 figures)


Key Result

Lemma 2.2

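A mechanism $\mathcal{M}$ satisfies $(\epsilon,\delta)$-DP if and only if $\sup_{X \sim X'} H_f\big(\mathcal{M}(X)\,\|\,\mathcal{M}(X')\big) \leq \delta$ for $f(z) = [z - {\rm e}^{\epsilon}]_+$, where the supremum is taken over neighbouring datasets $X \sim X'$ and $H_f$ denotes the $f$-divergence, which for this choice of $f$ is the hockey-stick divergence of order ${\rm e}^{\epsilon}$.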
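As a hedged illustration of the function $f(z) = [z - {\rm e}^{\epsilon}]_+$ in this lemma, the sketch below evaluates the resulting hockey-stick divergence $\int [p(t) - {\rm e}^{\epsilon} q(t)]_+ \, dt$ for a one-dimensional Gaussian mechanism, with $p$ and $q$ the output densities on two neighbouring datasets, and compares it with the well-known closed-form privacy profile of the Gaussian mechanism. The function names, the trapezoidal integration grid, and the parameter values are assumptions for illustration, not taken from the paper.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def norm_pdf(t, mean, sigma):
    z = (t - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gaussian_profile(eps, sigma, sens=1.0):
    """Closed-form delta(eps) of the Gaussian mechanism with the given
    sensitivity and noise scale."""
    a = sens / (2.0 * sigma)
    b = eps * sigma / sens
    return norm_cdf(a - b) - math.exp(eps) * norm_cdf(-a - b)


def hockey_stick_numeric(eps, sigma, sens=1.0, step=1e-3, width=10.0):
    """Trapezoidal approximation of the integral of [p - e^eps * q]_+ with
    p = N(sens, sigma^2) and q = N(0, sigma^2)."""
    lo = -width * sigma
    hi = sens + width * sigma
    n = int(round((hi - lo) / step))
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        t = lo + i * h
        g = max(norm_pdf(t, sens, sigma) - math.exp(eps) * norm_pdf(t, 0.0, sigma), 0.0)
        total += g * (0.5 if i in (0, n) else 1.0)  # half weight at the endpoints
    return total * h


if __name__ == "__main__":
    eps, sigma = 0.5, 4.0
    print("numeric    :", hockey_stick_numeric(eps, sigma))
    print("closed form:", gaussian_profile(eps, sigma))
```

For base mechanisms without a closed form, the same integral can be approximated on a grid, which is the sense in which privacy profiles lend themselves to numerical accounting.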

Figures (8)

  • Figure 1: Comparison of the $(\epsilon,\delta)$-bounds for the RNM mechanism \ref{eq:M_argmax} when the base mechanisms $\mathcal{M}_i$, $i \in [m]$, are one-dimensional Gaussian mechanisms with sensitivity 1 and noise scale $\sigma=4.0$, and when $\delta=10^{-6}$. Also plotted is the bound of Corollary \ref{lem:convert2_eps_delta}.
  • Figure 2: Comparison of various $(\epsilon,\delta)$-bounds when $K \sim \mathcal{D}_{\eta,\gamma}$ with $\eta=1$ (the geometric distribution, $m = \gamma^{-1}$) and $m=30,300,3000$. The base mechanism is the Gaussian mechanism with $L_2$-sensitivity 1 and noise parameter $\sigma=4$.
  • Figure 3: Growth of $\epsilon$-values for $\delta=10^{-6}$ for the RDP and hockey-stick divergence based bounds as a function of $m$, when $K \sim \mathcal{D}_{\eta,\gamma}$ with $\eta=1$ and the base mechanism is the Gaussian mechanism with $L_2$-sensitivity 1 and noise parameter $\sigma=4$. The privacy profile bound given by Thm. \ref{thm:negbin} retains the $\mathcal{O}(\log^{\frac{1}{2}} \frac{m}{\delta})$ growth of $\epsilon$-values from the DP RNM (see Fig. \ref{fig:comparison_argmax2}).
  • Figure 4: Top: Comparison of the bound of Thm. \ref{thm:hs_bin} for $K \sim \textrm{Bin}(n,m/n)$ for different values of $n$, when $m=10$, and the RDP bound of Thm. \ref{thm:rdp_poisson}. The base mechanism is the Gaussian mechanism with sensitivity 1 and $\sigma=4.0$. Bottom: Comparison of the CDFs for different values of $n$. Compared to the RDP bound, at $\delta \approx 10^{-6}$ we get a more concentrated $K$ for free by using the binomial distribution and Thm. \ref{thm:hs_bin}.
  • Figure 5: Linear regression problem on two UCI benchmark datasets. Tuning OPS-PTR (i.e., generalized PTR applied to the one-posterior sample algorithm) via the private selection algorithm outperforms baseline methods when the privacy cost of the tuning procedure is calculated using our Thm. \ref{thm:negbin}.
  • ...and 3 more figures
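
The captions above contrast a geometric number of rounds ($\eta=1$, $m=\gamma^{-1}$) with a binomial one, $K \sim \textrm{Bin}(n, m/n)$, and describe the latter as more concentrated. The sketch below compares the two distributions at the same mean $m$, using only their standard moments and CDFs. The function names and the particular values of $n$ are assumptions for illustration; the sketch says nothing about the associated privacy bounds.

```python
import math


def geometric_mean_std(m):
    """Mean and standard deviation of K >= 1 with P[K = k] = gamma * (1 - gamma)**(k - 1),
    where gamma = 1 / m."""
    gamma = 1.0 / m
    return 1.0 / gamma, math.sqrt(1.0 - gamma) / gamma


def binomial_mean_std(n, m):
    """Mean and standard deviation of K ~ Bin(n, m / n)."""
    p = m / n
    return n * p, math.sqrt(n * p * (1.0 - p))


def geometric_cdf(k, m):
    """P[K <= k] for the geometric distribution with mean m (support k >= 1)."""
    return 1.0 - (1.0 - 1.0 / m) ** k if k >= 1 else 0.0


def binomial_cdf(k, n, m):
    """P[K <= k] for K ~ Bin(n, m / n), summed term by term."""
    p = m / n
    return sum(math.comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(0, min(k, n) + 1))


if __name__ == "__main__":
    m = 10
    print("geometric:", geometric_mean_std(m))
    for n in (20, 100, 1000):  # illustrative values of n
        print(f"Bin({n}, {m}/{n}):", binomial_mean_std(n, m))
    # Probability mass on at most m / 2 rounds under each distribution.
    print("P[K <= 5]:", geometric_cdf(5, m), binomial_cdf(5, 100, m))
```

At equal mean, the binomial places far less mass on very small and very large $K$ than the geometric distribution, so a tuning run is less likely to end after only a handful of candidates.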

Theorems & Definitions (46)

  • Definition 2.1
  • Lemma 2.2 (Balle et al., 2018)
  • Lemma 2.3
  • Definition 2.4
  • Theorem 3.1 (Zhu and Wang, 2022)
  • Theorem 3.2
  • Theorem 3.3
  • Corollary 3.4
  • Theorem 4.1
  • Remark 4.2
  • ...and 36 more