Efficient Membership Inference Attacks by Bayesian Neural Network
Zhenlong Liu, Wenyu Jiang, Feng Zhou, Hongxin Wei
TL;DR
This work addresses privacy risks in supervised learning by approaching Membership Inference Attacks (MIAs) from a Bayesian perspective. It introduces the Bayesian Membership Inference Attack (BMIA), which converts a single trained reference model into a Bayesian neural network via Laplace approximation. The resulting predictive distribution captures both epistemic and aleatoric uncertainty, enabling a conditional, per-example attack without training multiple shadow models. The approach achieves state-of-the-art or competitive results across five datasets (Texas100, Purchase100, CIFAR-10, CIFAR-100, and ImageNet) at significantly lower computational cost, with substantial TPR gains at very low FPRs and shorter training time. The paper also provides theoretical insight into why conditional attacks outperform marginal ones, analyzes how sample size and Hessian factorization affect performance, and reports robust behavior under model mismatch and OOD conditions. Overall, BMIA offers an efficient and robust tool for auditing privacy risks in neural networks, using last-layer Laplace inference to quantify conditional score distributions.
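The "TPR at very low FPR" metric mentioned above can be illustrated with a minimal sketch. The function name `tpr_at_fpr` and the toy scores are hypothetical and are not taken from the paper; the sketch only assumes that a higher attack score means "more likely a member".

```python
import numpy as np

def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """Return (tpr, fpr, threshold) for the strictest threshold that flags
    at most a `target_fpr` fraction of non-members. Higher score = member."""
    members = np.asarray(member_scores, dtype=float)
    nonmembers_desc = np.sort(np.asarray(nonmember_scores, dtype=float))[::-1]
    # Number of non-members we are allowed to flag by mistake.
    k = int(np.floor(target_fpr * len(nonmembers_desc)))
    # Flag a sample when score > threshold; the threshold is the (k+1)-th
    # largest non-member score, so at most k non-members exceed it.
    threshold = nonmembers_desc[k] if k < len(nonmembers_desc) else -np.inf
    tpr = float(np.mean(members > threshold))
    fpr = float(np.mean(nonmembers_desc > threshold))
    return tpr, fpr, threshold

# Toy scores, invented for illustration only.
toy_members = [0.9, 0.8, 0.7, 0.2]
toy_nonmembers = [0.85, 0.5, 0.4, 0.3, 0.1, 0.05, 0.6, 0.65, 0.15, 0.25]
tpr, fpr, thr = tpr_at_fpr(toy_members, toy_nonmembers, target_fpr=0.1)
print(tpr, fpr, thr)  # 0.75 0.1 0.65
```

Reporting TPR at a fixed, small FPR rather than average accuracy reflects the auditing setting: an attack is only useful if it identifies members while wrongly flagging very few non-members.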
Abstract
Membership Inference Attacks (MIAs) aim to estimate whether a specific data point was used in the training of a given model. Previous attacks often rely on multiple reference models to approximate the conditional score distribution, which incurs significant computational overhead. While recent work leverages quantile regression to estimate conditional thresholds, it fails to capture epistemic uncertainty, resulting in bias in low-density regions. In this work, we propose a novel approach, the Bayesian Membership Inference Attack (BMIA), which performs a conditional attack through Bayesian inference. In particular, we transform a trained reference model into a Bayesian neural network by Laplace approximation, enabling direct estimation of the conditional score distribution through probabilistic model parameters. Our method addresses both epistemic and aleatoric uncertainty with only a single reference model, enabling an efficient and powerful MIA. Extensive experiments on five datasets demonstrate the effectiveness and efficiency of BMIA.
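To make the idea of a conditional attack from a single reference model more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binary classifier whose last layer is logistic regression on fixed features, fits the MAP weights, forms a Laplace (Gaussian) posterior from the Hessian, and propagates it to a per-example Gaussian over the signed logit score. It models only the parameter (epistemic) variance and omits the aleatoric term the abstract mentions. All names (`fit_map_logistic`, `score_distribution`, `is_member`, `prior_precision`, `alpha`) and the synthetic data are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def fit_map_logistic(features, labels, prior_precision=1.0, n_steps=50):
    """Newton's method for MAP weights of a binary logistic last layer.
    Returns the weights and the Hessian of the negative log posterior."""
    d = features.shape[1]
    w = np.zeros(d)

    def grad_and_hess(w):
        p = 1.0 / (1.0 + np.exp(-features @ w))
        grad = features.T @ (p - labels) + prior_precision * w
        hess = features.T @ (features * (p * (1.0 - p))[:, None])
        return grad, hess + prior_precision * np.eye(d)

    for _ in range(n_steps):
        grad, hess = grad_and_hess(w)
        w = w - np.linalg.solve(hess, grad)
    return w, grad_and_hess(w)[1]

def score_distribution(phi, label, w_map, hess):
    """Gaussian over the signed logit of the true label under the Laplace
    posterior N(w_map, hess^{-1}); the score is linear in the weights."""
    sign = 2.0 * label - 1.0
    mean = sign * (phi @ w_map)
    var = phi @ np.linalg.solve(hess, phi)  # epistemic variance only
    return float(mean), float(np.sqrt(var))

def is_member(target_score, mean, std, alpha=0.01):
    """One-sided test: flag as member if the target model's score is
    unusually high relative to the reference score distribution."""
    z = (target_score - mean) / std
    return bool(z > NormalDist().inv_cdf(1.0 - alpha)), z

# Synthetic reference data, invented for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = rng.normal(size=5)
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ true_w))).astype(float)

w_map, hess = fit_map_logistic(X, y)
mean, std = score_distribution(X[0], y[0], w_map, hess)
flag_high, z_high = is_member(mean + 5.0 * std, mean, std)  # far above
flag_mid, z_mid = is_member(mean, mean, std)                # typical
```

The design point this illustrates is that the threshold is set per example: the same target score can be unremarkable for one sample and highly suspicious for another, depending on the mean and spread of that sample's reference score distribution.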
