Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy
Yingyu Lin, Yi-An Ma, Yu-Xiang Wang, Rachel Redberg, Zhiqi Bu
TL;DR
This paper addresses the gap between theory and practice in private learning by introducing Approximate Sample Perturbation (ASAP), an MCMC-based method that preserves pure differential privacy while sampling from an approximate posterior. ASAP perturbs an MCMC sample with noise proportional to its $W_{\infty}$ distance from a DP reference distribution, and uses a constrained Metropolis-adjusted Langevin algorithm (MALA) to obtain convergence guarantees in $W_{\infty}$ distance. A key technical contribution is a TV-to-$W_{\infty}$ conversion lemma that gives pure DP guarantees for approximate samplers, which in turn allows end-to-end localization to a bounded domain. The end-to-end localized ASAP yields a near-linear-time algorithm for DP-ERM that achieves optimal rates for strongly convex and smooth losses under both pure DP and Gaussian DP, the first such result in this regime. The approach provides a practical bridge between private Bayesian learning and computational efficiency, with potential applicability beyond DP-ERM wherever DP-preserving sampling is required.
Abstract
Posterior sampling, i.e., using the exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from the potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the unappealing $\delta$-approximation error into the privacy guarantees. To bridge this gap, we propose the Approximate SAmple Perturbation (ASAP) algorithm, which perturbs an MCMC sample with noise proportional to its Wasserstein-infinity ($W_\infty$) distance from a reference distribution that satisfies pure DP or pure Gaussian DP (i.e., $\delta=0$). We then leverage a Metropolis-Hastings algorithm to generate the sample and prove that the algorithm converges in $W_\infty$ distance. We show that by combining our new techniques with a localization step, we obtain the first nearly linear-time algorithm that achieves the optimal rates in the DP-ERM problem with strongly convex and smooth losses.
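The perturbation step described above can be illustrated with a minimal sketch. It assumes that an upper bound on the $W_\infty$ distance between the sampler's output and the reference distribution is already known, and it adds noise with density proportional to $\exp(-\|z\|_2/b)$, with $b$ proportional to that bound. The function names, the `sensitivity_factor` parameter and its default value, and the choice of this particular noise family are illustrative assumptions, not details taken from the paper; the sketch does not by itself certify any privacy guarantee.

```python
import numpy as np


def l2_laplace_noise(dim, scale, rng):
    """Draw z in R^dim with density proportional to exp(-||z||_2 / scale).

    The radius of such a vector follows Gamma(shape=dim, scale=scale),
    and its direction is uniform on the unit sphere.
    """
    direction = rng.standard_normal(dim)
    direction /= np.linalg.norm(direction)
    radius = rng.gamma(shape=dim, scale=scale)
    return radius * direction


def perturb_sample(approx_sample, w_inf_bound, eps_noise, rng,
                   sensitivity_factor=2.0):
    """Add noise whose scale is proportional to an assumed W_infinity bound.

    approx_sample      : output of an approximate sampler (e.g. an MCMC chain)
    w_inf_bound        : assumed bound on the W_infinity distance between the
                         sampler's output and the DP reference distribution
    eps_noise          : privacy budget spent on the perturbation (hypothetical)
    sensitivity_factor : hypothetical constant, not taken from the paper
    """
    approx_sample = np.asarray(approx_sample, dtype=float)
    scale = sensitivity_factor * w_inf_bound / eps_noise
    return approx_sample + l2_laplace_noise(approx_sample.size, scale, rng)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.array([0.3, -1.2, 0.8])  # stand-in for an MCMC sample
    print(perturb_sample(theta, w_inf_bound=1e-3, eps_noise=0.5, rng=rng))
```

The point of the sketch is the scaling: the tighter the sampler's $W_\infty$ guarantee, the smaller the added noise, so a sampler that converges quickly in $W_\infty$ pays little extra utility cost for the perturbation.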
