MCMC for Bayesian estimation of Differential Privacy from Membership Inference Attacks
Ceren Yildirim, Kamer Kaya, Sinan Yildirim, Erkay Savas
TL;DR
This work tackles empirical privacy assessment for differential privacy by grounding it in a Bayesian framework that leverages multiple membership inference attacks (MIAs). It introduces MCMC-DP-Est, a latent-variable Markov chain Monte Carlo method that jointly infers the DP privacy parameter $\epsilon$ and an attack-strength parameter $s$ from observed false-positive and false-negative counts across several challenge bases, without presuming the strongest possible attack. The method builds a hierarchical model linking $\epsilon$, $\delta$, and MIA performance, and provides a practical way to measure MIA performance via LiRA-inspired loss statistics. Experiments on artificial data and real data (MNIST) show that the approach yields coherent posterior distributions for $\epsilon$ and $s$, with results aligning with attack performance and robust to varying randomness. The framework supports combining multiple MIAs and attack strategies, enabling cautious privacy auditing that avoids overconfident estimates tied to idealized, strongest-attack assumptions.
Abstract
We propose a new framework for Bayesian estimation of differential privacy, incorporating evidence from multiple membership inference attacks (MIAs). Bayesian estimation is carried out via a Markov chain Monte Carlo (MCMC) algorithm, named MCMC-DP-Est, which provides an estimate of the full posterior distribution of the privacy parameter (rather than, e.g., only credible intervals). Critically, the proposed method does not assume that privacy auditing is performed with the most powerful attack on the worst-case (dataset, challenge point) pair, which is typically unrealistic. Instead, MCMC-DP-Est jointly estimates the strengths of the MIAs used and the privacy of the training algorithm, yielding a more cautious privacy analysis. We also present an economical way to generate measurements of an MIA's performance, which the MCMC method then uses to estimate privacy. We demonstrate the methods on numerical examples with both artificial and real data.
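For context, the link between differential privacy and MIA error counts can be illustrated with the standard hypothesis-testing constraint: under $(\epsilon, \delta)$-DP, any attack's false-positive rate $\alpha$ and false-negative rate $\beta$ satisfy $\alpha + e^{\epsilon}\beta \geq 1 - \delta$ and $\beta + e^{\epsilon}\alpha \geq 1 - \delta$. The sketch below is not MCMC-DP-Est; it is only the plug-in point estimate that this constraint implies for a single attack. The function name and its arguments are hypothetical, chosen for illustration.

```python
import math


def epsilon_lower_bound(n_fp, n_fn, n_neg, n_pos, delta=0.0):
    """Plug-in lower bound on epsilon implied by one attack's error counts.

    Hypothetical helper for illustration only (not the paper's method).
    n_fp: false positives among n_neg non-member trials
    n_fn: false negatives among n_pos member trials
    """
    alpha = n_fp / n_neg  # empirical false-positive rate
    beta = n_fn / n_pos   # empirical false-negative rate

    candidates = [0.0]  # epsilon is non-negative by definition
    # Each constraint a + e^eps * b >= 1 - delta gives eps >= log((1 - delta - a) / b).
    for a, b in ((alpha, beta), (beta, alpha)):
        numerator = 1.0 - delta - a
        if numerator <= 0.0:
            continue  # this constraint already holds for every eps >= 0
        if b == 0.0:
            return math.inf  # no finite eps is consistent with these rates
        candidates.append(math.log(numerator / b))
    return max(candidates)


if __name__ == "__main__":
    # Example counts (made up): 10 false positives and 10 false negatives
    # out of 100 trials each, with delta = 0.
    print(epsilon_lower_bound(10, 10, 100, 100))
```

Such a point estimate treats the empirical rates as exact and reflects only what the particular attack achieved, so it ignores both sampling uncertainty and how far the attack is from the strongest possible one. Those are the two gaps that a posterior over $\epsilon$ and an attack-strength parameter is meant to address.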
