Locally Optimal Private Sampling: Beyond the Global Minimax

Hrad Ghoukasian, Bonwoo Lee, Shahab Asoodeh

TL;DR

The paper advances private sampling under local differential privacy (LDP) by introducing a local minimax framework around a public prior $P_0$, with the neighborhood $N_\gamma(P_0)$ defined via the $E_\gamma$-divergence. It extends the global functional-LDP minimax analysis to a local setting, proving that the local minimax risk in $N_\gamma(P_0)$ matches the global minimax risk restricted to that neighborhood, and deriving closed-form samplers that are independent of the specific $f$-divergence. It also provides a pure-LDP, nonlinear sampler that is instance-optimal on the local neighborhood, surpassing the linear functional-LDP mechanisms. Numerical experiments on finite and continuous spaces demonstrate that locally minimax samplers consistently outperform global minimax samplers under both pure LDP and Gaussian LDP (GLDP), highlighting the practical benefits of incorporating public data into privacy-preserving data generation. The work lays groundwork for personalized private sampling and suggests several avenues for future work, including scalable implementations and generalized neighborhood definitions.

Abstract

We study the problem of sampling from a distribution under local differential privacy (LDP). Given a private distribution $P \in \mathcal{P}$, the goal is to generate a single sample from a distribution that remains close to $P$ in $f$-divergence while satisfying the constraints of LDP. This task captures the fundamental challenge of producing realistic-looking data under strong privacy guarantees. While prior work by Park et al. (NeurIPS'24) focuses on global minimax-optimality across a class of distributions, we take a local perspective. Specifically, we examine the minimax risk in a neighborhood around a fixed distribution $P_0$, and characterize its exact value, which depends on both $P_0$ and the privacy level. Our main result shows that the local minimax risk is determined by the global minimax risk when the distribution class $\mathcal{P}$ is restricted to a neighborhood around $P_0$. To establish this, we (1) extend previous work from pure LDP to the more general functional LDP framework, and (2) prove that the globally optimal functional LDP sampler yields the optimal local sampler when constrained to distributions near $P_0$. Building on this, we also derive a simple closed-form expression for the locally minimax-optimal samplers which does not depend on the choice of $f$-divergence. We further argue that this local framework naturally models private sampling with public data, where the public data distribution is represented by $P_0$. In this setting, we empirically compare our locally optimal sampler to existing global methods, and demonstrate that it consistently outperforms global minimax samplers.
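To make the idea of a neighborhood around $P_0$ concrete, here is a minimal sketch on a finite space. It uses the standard hockey-stick form $E_\gamma(P \,\|\, Q) = \sum_x \max(p(x) - \gamma q(x), 0)$. The membership rule (both $E_\gamma(P \,\|\, P_0)$ and $E_\gamma(P_0 \,\|\, P)$ equal zero) is an assumption made for illustration and is not stated in this summary; the function names and example distributions are hypothetical as well.

```python
from typing import Sequence


def hockey_stick(p: Sequence[float], q: Sequence[float], gamma: float) -> float:
    """E_gamma(P || Q) = sum_x max(p(x) - gamma * q(x), 0) on a finite space."""
    return sum(max(pi - gamma * qi, 0.0) for pi, qi in zip(p, q))


def in_neighborhood(p: Sequence[float], p0: Sequence[float],
                    gamma: float, tol: float = 1e-12) -> bool:
    """Assumed membership rule: both hockey-stick divergences vanish,
    i.e. p0(x) / gamma <= p(x) <= gamma * p0(x) for every x."""
    return (hockey_stick(p, p0, gamma) <= tol
            and hockey_stick(p0, p, gamma) <= tol)


# Hypothetical example: a uniform public prior on four symbols.
p0 = [0.25, 0.25, 0.25, 0.25]
near = [0.30, 0.20, 0.25, 0.25]   # stays within a factor of 2 of p0
far = [0.70, 0.10, 0.10, 0.10]    # 0.70 exceeds 2 * 0.25
gamma = 2.0

print(in_neighborhood(near, p0, gamma))
print(in_neighborhood(far, p0, gamma))
```

With $\gamma = 1$ the same function reduces to the total variation distance, which gives a quick sanity check on the implementation.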

Paper Structure

This paper contains 37 sections, 17 theorems, 253 equations, 11 figures, 1 table.

Key Result

Proposition 3.3

Let $g$ be a trade-off function. Under Assumption \ref{assumption: norm}, if $1 + g^*(-e^\beta) > \frac{(c_2 - c_1 e^\beta)(1 - c_1)}{c_2 - c_1}$ for all $\beta \geq 0$, then $\mathbf{Q}(P) = P$ satisfies $g$-FLDP.

Figures (11)

  • Figure 1: Comparison of global and local minimax-optimal samplers under pure LDP ($\varepsilon = 1$) and $\nu$-GLDP ($\nu = 1.5$). Full details of this experiment are provided in Appendix \ref{appendix: laplace mixture}.
  • Figure 2: Illustrations of the optimal samplers described in Theorems \ref{thm: local functional} and \ref{thm: local pure}. Left: $\mathbf{Q}^\star_{g, N_\gamma(P_0)}$ for functional LDP. Right: $\mathbf{Q}^\star_{\varepsilon, N_\gamma(P_0)}$ for pure LDP. Following \cite{park2024exactly}, $M_\varepsilon$ is defined as $M_\varepsilon \coloneqq \{ Q \in \mathcal{C}(\mathbb{R}^n) : \frac{\gamma + 1}{\gamma + e^\varepsilon} \, p_0(x) \leq q(x) \leq \frac{(\gamma + 1)e^\varepsilon}{\gamma + e^\varepsilon} \, p_0(x), \ \forall x \in \mathbb{R}^n \}$.
  • Figure 3: Theoretical worst-case $f$-divergences of global and local minimax samplers under the pure LDP setting with uniform reference distribution $\mu_k$ over finite space ($k = 20$).
  • Figure 4: Empirical worst-case $f$-divergences of global and local minimax samplers under the pure LDP setting, over 100 experiments on a 1-D Laplace mixture.
  • Figure 5: Theoretical worst-case $f$-divergences of global and local minimax samplers under the pure LDP setting with uniform reference distribution $\mu_k$ over finite space ($k = 10$).
  • ...and 6 more figures
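The band that defines $M_\varepsilon$ in the Figure 2 caption can be checked numerically. The sketch below is a finite-space analogue (the caption works over $\mathbb{R}^n$), with hypothetical function names and example values. It tests membership in the band and illustrates why the band is compatible with pure LDP: the upper bound is exactly $e^\varepsilon$ times the lower bound, so any two output distributions inside it differ pointwise by a factor of at most $e^\varepsilon$.

```python
import math
from typing import List, Sequence, Tuple


def band(p0: Sequence[float], eps: float,
         gamma: float) -> Tuple[List[float], List[float]]:
    """Lower and upper envelopes of M_eps from the Figure 2 caption."""
    lo = (gamma + 1.0) / (gamma + math.exp(eps))
    hi = lo * math.exp(eps)  # equals (gamma + 1) e^eps / (gamma + e^eps)
    return [lo * x for x in p0], [hi * x for x in p0]


def in_m_eps(q: Sequence[float], p0: Sequence[float], eps: float,
             gamma: float, tol: float = 1e-12) -> bool:
    """True if q lies pointwise between the two envelopes."""
    lower, upper = band(p0, eps, gamma)
    return all(l - tol <= qi <= u + tol
               for l, qi, u in zip(lower, q, upper))


def max_log_ratio(q1: Sequence[float], q2: Sequence[float]) -> float:
    """Largest |log(q1(x) / q2(x))| over the space (entries must be > 0)."""
    return max(abs(math.log(a / b)) for a, b in zip(q1, q2))


# Hypothetical example values.
p0 = [0.25, 0.25, 0.25, 0.25]
eps, gamma = 1.0, 2.0
q1 = [0.40, 0.20, 0.20, 0.20]
q2 = [0.16, 0.28, 0.28, 0.28]
q3 = [0.50, 0.20, 0.20, 0.10]   # violates both envelopes

print(in_m_eps(q1, p0, eps, gamma), in_m_eps(q2, p0, eps, gamma))
print(max_log_ratio(q1, q2) <= eps)
print(in_m_eps(q3, p0, eps, gamma))
```

Because $e^\varepsilon \geq 1$, the lower multiplier is at most 1 and the upper multiplier is at least 1, so $P_0$ itself always lies inside the band.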

Theorems & Definitions (37)

  • Definition 2.1: Approximate LDP
  • Definition 2.2: Trade-off function
  • Definition 2.3: Functional LDP (dong2022gaussian, lee2023minimax)
  • Definition 2.4: $f$-divergence (ali1966general, csiszar1967information)
  • Definition 2.5: Decomposability (park2024exactly)
  • Example 3.1
  • Proposition 3.3
  • Theorem 3.4
  • Theorem 3.5
  • Theorem 3.6
  • ...and 27 more
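For readers who want to experiment with the $f$-divergences named in Definition 2.4, the following is a minimal finite-space sketch of the standard formula $D_f(P \,\|\, Q) = \sum_x q(x)\, f\!\left(p(x)/q(x)\right)$. It assumes $q(x) > 0$ everywhere. The particular generators (KL, total variation, squared Hellinger without the factor $1/2$) are common textbook choices picked for illustration and are not necessarily the ones plotted in the paper's figures.

```python
import math
from typing import Callable, Sequence


def f_divergence(p: Sequence[float], q: Sequence[float],
                 f: Callable[[float], float]) -> float:
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)), assuming q(x) > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


def kl(t: float) -> float:
    """Generator of the KL divergence, with the convention 0 * log 0 = 0."""
    return t * math.log(t) if t > 0 else 0.0


def tv(t: float) -> float:
    """Generator of the total variation distance."""
    return 0.5 * abs(t - 1.0)


def sq_hellinger(t: float) -> float:
    """Generator of the squared Hellinger distance (no 1/2 factor)."""
    return (math.sqrt(t) - 1.0) ** 2


# Hypothetical example values.
p = [0.9, 0.1]
q = [0.5, 0.5]
for name, gen in [("KL", kl), ("TV", tv), ("H^2", sq_hellinger)]:
    print(name, f_divergence(p, q, gen))
```

Each generator satisfies $f(1) = 0$, so every divergence vanishes when $P = Q$.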