Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy

Yingyu Lin, Yi-An Ma, Yu-Xiang Wang, Rachel Redberg, Zhiqi Bu

TL;DR

This paper addresses the gap between theory and practice in private learning by introducing Approximate Sample Perturbation (ASAP), an MCMC-based method that preserves pure differential privacy while sampling from an approximate posterior. ASAP perturbs an MCMC sample with noise proportional to its $W_{\infty}$ distance from a DP reference distribution, and uses a constrained Metropolis-adjusted Langevin algorithm (MALA) to obtain convergence guarantees in $W_{\infty}$ distance. A key technical contribution is a TV-to-$W_{\infty}$ conversion lemma that gives pure DP guarantees for approximate samplers and supports end-to-end localization to a bounded domain. The end-to-end localized ASAP yields a nearly linear-time DP-ERM algorithm with optimal rates for strongly convex and smooth losses under both pure DP and Gaussian DP, the first such result in this regime. The approach provides a practical bridge between private Bayesian learning and computational efficiency, with potential applicability beyond DP-ERM wherever DP-preserving sampling is required.
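As a rough illustration of the constrained MALA ingredient mentioned above, the following Python sketch runs Metropolis-adjusted Langevin steps for a target proportional to $\exp(-f)$ restricted to a Euclidean ball. The function names, the ball-shaped domain, the step size, and the rule of rejecting any proposal that leaves the domain are assumptions made for illustration; the paper's actual sampler and its $W_{\infty}$ analysis are not reproduced here.

```python
import math
import random


def mala_step(x, f, grad_f, step, radius, rng):
    """One MALA step for a density proportional to exp(-f) on {||x||_2 <= radius}.

    Hypothetical helper: proposals outside the ball are rejected, which is the
    Metropolis-Hastings rule for a target that is zero outside the domain.
    """
    g = grad_f(x)
    noise_scale = math.sqrt(2.0 * step)
    # Langevin proposal: gradient step plus Gaussian noise.
    y = [xi - step * gi + noise_scale * rng.gauss(0.0, 1.0) for xi, gi in zip(x, g)]
    if math.sqrt(sum(v * v for v in y)) > radius:
        return x  # target density is zero outside the ball, so reject
    gy = grad_f(y)
    # Log proposal densities (up to a shared constant) for the MH correction.
    log_q_fwd = -sum((yi - xi + step * gi) ** 2 for xi, yi, gi in zip(x, y, g)) / (4.0 * step)
    log_q_bwd = -sum((xi - yi + step * gyi) ** 2 for xi, yi, gyi in zip(x, y, gy)) / (4.0 * step)
    log_alpha = f(x) - f(y) + log_q_bwd - log_q_fwd
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    if math.log(1.0 - rng.random()) < log_alpha:
        return y
    return x


def run_chain(x0, f, grad_f, step, radius, n_steps, rng):
    """Run n_steps of mala_step from x0 and return the list of visited states."""
    x, samples = list(x0), []
    for _ in range(n_steps):
        x = mala_step(x, f, grad_f, step, radius, rng)
        samples.append(x)
    return samples
```

For example, `run_chain([0.0], f, grad_f, step=0.5, radius=3.0, n_steps=20000, rng=random.Random(0))` with `f(x) = 0.5 * ||x||^2` and `grad_f(x) = x` samples a standard Gaussian truncated to $[-3, 3]$. The sketch carries no privacy or mixing guarantee on its own.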

Abstract

Posterior sampling, i.e., using the exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from the potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the unappealing $\delta$-approximation error into the privacy guarantees. To bridge this gap, we propose the Approximate SAmple Perturbation (abbr. ASAP) algorithm, which perturbs an MCMC sample with noise proportional to its Wasserstein-infinity ($W_\infty$) distance from a reference distribution that satisfies pure DP or pure Gaussian DP (i.e., $\delta=0$). We then leverage a Metropolis-Hastings algorithm to generate the sample and prove that the algorithm converges in $W_\infty$ distance. We show that by combining our new techniques with a localization step, we obtain the first nearly linear-time algorithm that achieves the optimal rates in the DP-ERM problem with strongly convex and smooth losses.
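The perturbation step described in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names, the use of independent per-coordinate Laplace noise, and the scale `w_inf_bound / eps_noise` are illustrative choices rather than the paper's exact calibration, and the sketch takes an upper bound on the $W_\infty$ distance as given instead of deriving it.

```python
import random


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def asap_perturb(theta_tilde, w_inf_bound, eps_noise, rng):
    """Perturb an approximate sample with noise proportional to a W_inf bound.

    theta_tilde : list of floats, an approximate (e.g. MCMC) sample
    w_inf_bound : assumed upper bound on W_inf between the sampler's output
                  distribution and the DP reference distribution
    eps_noise   : hypothetical privacy parameter spent on the perturbation
    """
    if w_inf_bound < 0 or eps_noise <= 0:
        raise ValueError("need w_inf_bound >= 0 and eps_noise > 0")
    if w_inf_bound == 0:
        return list(theta_tilde)  # exact sample from the reference, no noise needed
    scale = w_inf_bound / eps_noise  # noise grows linearly with the W_inf bound
    return [t + laplace(scale, rng) for t in theta_tilde]
```

The design point the sketch tries to convey is that a tighter $W_\infty$ bound on the sampler translates directly into less added noise. The exact norm, constants, and resulting privacy parameters should be taken from the paper.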

Paper Structure

This paper contains 40 sections, 21 theorems, 32 equations, 3 figures, 2 tables, and 4 algorithms.

Key Result

Lemma 4

Assume the loss function is $G$-Lipschitz. Then the posterior sampling mechanism with parameters $\gamma, \lambda > 0$ satisfying $\gamma \leq \mu^2 \lambda / G^2$ satisfies $\mu$-GDP.
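To make the condition in Lemma 4 concrete, the short Python sketch below evaluates it in both directions: the largest $\gamma$ allowed for a target $\mu$, and the smallest $\mu$ that the inequality certifies for a given $\gamma$. The function names and the numeric values in the usage note are hypothetical and chosen only for illustration.

```python
import math


def max_gamma_for_gdp(mu, lam, G):
    """Largest gamma with gamma <= mu^2 * lam / G^2 (the condition in Lemma 4)."""
    if mu <= 0 or lam <= 0 or G <= 0:
        raise ValueError("mu, lam and G must be positive")
    return mu ** 2 * lam / G ** 2


def certified_mu(gamma, lam, G):
    """Smallest mu for which gamma <= mu^2 * lam / G^2 holds: mu = G * sqrt(gamma / lam)."""
    if gamma <= 0 or lam <= 0 or G <= 0:
        raise ValueError("gamma, lam and G must be positive")
    return G * math.sqrt(gamma / lam)
```

For instance, with $G = 300$, an illustrative $\lambda = 9$, and target $\mu = 1$, `max_gamma_for_gdp(1.0, 9.0, 300.0)` evaluates $1 \cdot 9 / 300^2$, and feeding that value back into `certified_mu` recovers $\mu = 1$ up to floating-point error.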

Figures (3)

  • Figure 1: Two examples illustrating the couplings of $\Tilde{p}$ and $p^*$. Let $\zeta^*$ be the optimal coupling of $W_{\infty}(\Tilde{p}, p^*)$, and let $\Tilde{p}\otimes p^*$ denote the independent coupling. In both scenarios, the marginal distributions are $\Tilde{p}$ and $p^*$. Denote $\Delta=W_{\infty}(\Tilde{p}, p^*)$. In Figure (a), when $(\Tilde{\theta},\theta^*)$ follows the optimal coupling, $(\Tilde{\theta},\theta^*)$ is confined within the band $|\Tilde{\theta}-\theta^*|\leq \Delta$. Conversely, Figure (b) shows that when $\Tilde{\theta}$ and $\theta^*$ are independently sampled, the distance $|\Tilde{\theta}-\theta^*|$ can take relatively large values with positive probability. Through the appropriate coupling of the distributions $\Tilde{p}$ and $p^*$, particularly via the optimal coupling $\zeta^*$, we obtain a tight almost-sure bound $\Delta$ on the distance between the two samples $\Tilde{\theta}$ and $\theta^*$.
  • Figure 2: Excess empirical risks for strongly convex losses, for the methods in the summary table (label tab:summary_strongconvex). Here $d=11$, $G=300$, $\alpha=4$, $\varepsilon=1$. Left: $n=10^4$. Right: $n=10^6$.
  • Figure 3: Excess empirical risks for strongly convex losses on Wine Quality -- Red dataset.
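The contrast drawn in the Figure 1 caption between the optimal and the independent coupling can be reproduced numerically in one dimension. The sketch below assumes two empirical distributions with the same number of equally weighted atoms; in that setting the sorted (monotone) matching is an optimal coupling on the real line, so $W_\infty$ is the largest gap between matched order statistics. The function names are hypothetical.

```python
def w_inf_1d(xs, ys):
    """W_inf between two equal-size, equally weighted empirical distributions on R.

    Uses the monotone coupling: the i-th smallest x is matched to the i-th smallest y.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    return max(abs(a - b) for a, b in zip(sorted(xs), sorted(ys)))


def max_gap_product_coupling(xs, ys):
    """Largest |x - y| that has positive probability under the independent coupling.

    The product coupling puts mass on every (x, y) pair, so the worst case is the
    gap between the extreme atoms of the two samples.
    """
    if not xs or not ys:
        raise ValueError("need two non-empty samples")
    return max(max(xs) - min(ys), max(ys) - min(xs))
```

With `xs = [0.0, 1.0, 2.0, 3.0]` and `ys = [3.1, 0.1, 2.1, 1.1]`, the monotone coupling keeps every matched pair within about 0.1 of each other, while the independent coupling allows gaps as large as about 3.1, mirroring panels (a) and (b) of Figure 1.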

Theorems & Definitions (31)

  • Definition 1: Differential privacy [dwork2006calibrating, dwork2014algorithmic]
  • Definition 2: Hockey-Stick Divergence
  • Definition 3: Gaussian Differential Privacy [dong2022gaussian]
  • Lemma 4: GDP of posterior sampling [gopi2022private]
  • Lemma 5: Pure DP of posterior sampling
  • Lemma 6 [deklerk2018comparison]
  • Definition 7
  • Lemma 8: Converting TV distance to $W_\infty$ distance
  • Lemma 9
  • Theorem 1: Mixing time in TV distance
  • ...and 21 more