
Optimal partition selection with Rényi differential privacy

Charlie Harrison, Pasin Manurangsi

TL;DR

An extension of the optimal RDP partition selection algorithm, tuned for $L^2$-bounded weighted partition selection. It can serve as a drop-in improvement over the Gaussian mechanism whenever the partition frequency is not also needed, and it plugs easily into state-of-the-art partition selection algorithms.

Abstract

A common problem in private data analysis is the partition selection problem, where each user holds a set of partitions (e.g. keys in a GROUP BY operation) from a possibly unbounded set. The challenge is to maximize the set of released partitions while respecting a differential privacy constraint. Previous work [Desfontaines et al., PoPETS 2022] presented an optimal $(\varepsilon, \delta)$-DP algorithm when each user submits only a single partition. We generalize this approach to find the optimal algorithm under $\delta$-approximate $(\alpha, \varepsilon)$-Rényi differential privacy (RDP), which allows much tighter analysis under composition. Motivated by the non-existence of a general optimality result in the case where users submit multiple partitions each, we present an extension of our optimal algorithm tuned for $L^2$-bounded weighted partition selection, which can be used as a drop-in improvement over the Gaussian mechanism any time the partition frequency is not also needed. We show that our primitive can be easily plugged into state-of-the-art partition selection algorithms (PolicyGaussian from [Gopi et al., ICML 2020] and MAD2R from [Chen et al., ICML 2025]), improving performance for both parallel and sequential adaptive algorithms. Finally, we show that there is an inherent cost to algorithms that do support releasing the frequency as well as the partitions. Specifically, we formulate a basic notion of an optimal approximate RDP algorithm for partition selection using additive noise, and show that there is a numerical separation between additive and non-additive noise mechanisms for this problem.

Paper Structure

This paper contains 32 sections, 15 theorems, 39 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Proposition 5

Let $M_1$ and $M_2$ satisfy $(\delta_1, \alpha, \varepsilon_1)$-RDP and $(\delta_2, \alpha, \varepsilon_2)$-RDP, respectively. Then $M_1 \circ M_2$ satisfies $(\delta, \alpha, \varepsilon_1 + \varepsilon_2)$-RDP, where $\delta = \delta_1 + \delta_2 - \delta_1 \cdot \delta_2 \le \delta_1 + \delta_2$.
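The parameter bookkeeping in Proposition 5 can be illustrated with a minimal Python sketch. The names `ApproxRDP` and `compose` are hypothetical and do not come from the paper. The sketch only tracks the $(\delta, \alpha, \varepsilon)$ triples and assumes both mechanisms are stated at the same order $\alpha$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApproxRDP:
    """Hypothetical container for a (delta, alpha, epsilon) approximate RDP guarantee."""
    delta: float
    alpha: float
    epsilon: float


def compose(m1: ApproxRDP, m2: ApproxRDP) -> ApproxRDP:
    """Combine two guarantees at the same order alpha, following Proposition 5."""
    if m1.alpha != m2.alpha:
        raise ValueError("both guarantees must use the same Renyi order alpha")
    # delta = delta_1 + delta_2 - delta_1 * delta_2, which is at most delta_1 + delta_2.
    delta = m1.delta + m2.delta - m1.delta * m2.delta
    # The epsilon terms add up.
    return ApproxRDP(delta=delta, alpha=m1.alpha, epsilon=m1.epsilon + m2.epsilon)


if __name__ == "__main__":
    combined = compose(ApproxRDP(1e-6, 8.0, 0.3), ApproxRDP(2e-6, 8.0, 0.2))
    print(combined)
```

Because the composed $\delta$ equals $1 - (1 - \delta_1)(1 - \delta_2)$, it is always slightly smaller than the plain sum $\delta_1 + \delta_2$.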

Figures (3)

  • Figure 1: Gaussian and SNAPS mechanism release probabilities for $(1, 10^{-5})$-DP. SNAPS is parameterized as in \ref{sec:exp}. The Gaussian probabilities are determined following [Gopi et al., ICML 2020] and [Chen et al., ICML 2025], i.e. by splitting the $\delta$ budget evenly in two: one half for computing the analytical Gaussian $\sigma$, and one half for determining the threshold $\tau = \max_{k \in [\Delta_0]} \{1/\sqrt{k} + \sigma \cdot \Phi^{-1}\left((1-\delta/2)^{1/k}\right)\}$.
  • Figure 2: Probability mass functions (centered at 0) for optimal additive noise distributions satisfying $\pi(61) = 1$ at various values of $\alpha$, as minimized by the convex program in \ref{thm:opt-additive-simplified}. As $\alpha$ grows, the optimal distribution converges to a truncated discrete Laplace (\ref{prop:converge-discrete-laplace}). At smaller $\alpha$ the optimal distribution becomes platykurtic, with a flatter peak and thinner tails.
  • Figure 3: The privacy of various mechanisms under the constraint that $\pi(61) = 1$, i.e. additive noise must be bounded in $[0, 60]$ (or equivalently $[-30, 30]$ due to translation invariance). $\pi^*$ clearly dominates all additive mechanisms for small and moderate $\alpha$. The (truncated) Gaussian and Laplace plots were computed by numerically solving for the scale parameters that ensure $f(-30)=f(30)=\delta$ in their respective PMFs after truncation and normalization.
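The threshold formula in the Figure 1 caption can be evaluated directly. The sketch below is illustrative only. The function name `gaussian_threshold` and its parameter names are hypothetical. It takes $\sigma$ as a given input rather than reproducing the analytical Gaussian calibration, and it reads $[\Delta_0]$ as the set $\{1, \dots, \Delta_0\}$.

```python
import math
from statistics import NormalDist


def gaussian_threshold(sigma: float, delta: float, delta0: int) -> float:
    """Evaluate tau = max_{k in [delta0]} { 1/sqrt(k) + sigma * Phi^{-1}((1 - delta/2)^(1/k)) }.

    sigma:  Gaussian noise scale, assumed to be calibrated elsewhere.
    delta:  total delta budget; half of it is spent on the threshold.
    delta0: maximum number of partitions per user (Delta_0 in the caption).
    """
    std_normal = NormalDist()  # standard normal; inv_cdf plays the role of Phi^{-1}
    candidates = []
    for k in range(1, delta0 + 1):
        # inv_cdf requires a probability strictly between 0 and 1.
        p = (1.0 - delta / 2.0) ** (1.0 / k)
        candidates.append(1.0 / math.sqrt(k) + sigma * std_normal.inv_cdf(p))
    return max(candidates)


if __name__ == "__main__":
    print(gaussian_threshold(sigma=4.0, delta=1e-5, delta0=10))
```

For very small `delta` combined with very large `delta0`, the probability passed to `inv_cdf` can round to exactly 1.0 in floating point, in which case `inv_cdf` raises `statistics.StatisticsError`; a production version would need to handle that case.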

Theorems & Definitions (40)

  • Definition 1: Differential privacy [Dwork et al., 2006]
  • Definition 2: Rényi divergence [Rényi, 1961]
  • Definition 3: Approximate Rényi divergence [Papernot and Steinke, 2022]
  • Definition 4: Approximate Rényi DP [Papernot and Steinke, 2022]
  • Proposition 5: Approximate RDP composition
  • Proposition 6
  • Definition 7: Private partition selection
  • Definition 8: Differentially private partition selection primitive
  • Remark 9
  • Definition 10
  • ...and 30 more