Do Parameters Reveal More than Loss for Membership Inference?
Anshuman Suri, Xiao Zhang, David Evans
TL;DR
This paper shows that black-box access is generally insufficient for optimal membership inference (MI) against models trained with stochastic gradient descent (SGD): optimal inference requires access to model parameters. It derives a theoretical framework that links SGD dynamics, the Hessian, and the posterior over parameters to MI, and introduces the Inverse Hessian Attack (IHA), a white-box auditing method that scores records using inverse-Hessian-vector products and gradient signals. Empirically, IHA matches or surpasses state-of-the-art reference-model attacks on several datasets, and the results highlight the critical role of the Hessian-informed terms. IHA can audit privacy leakage without training reference models, although at notable computational cost. The findings motivate broader investigation of white-box MI methods, careful consideration of Hessian structure, and practical damping and approximation strategies to enable scalable privacy auditing in real-world systems.
Abstract
Membership inference attacks are a key tool for disclosure auditing. They aim to infer whether an individual record was used to train a model. While such evaluations are useful for demonstrating risk, they are computationally expensive and often make strong assumptions about potential adversaries' access to models and training environments, and thus do not provide tight bounds on leakage from potential attacks. We show that prior claims that black-box access is sufficient for optimal membership inference do not hold for stochastic gradient descent, and that optimal membership inference indeed requires white-box access. Our theoretical results lead to a new white-box inference attack, IHA (Inverse Hessian Attack), that explicitly uses model parameters by computing inverse-Hessian vector products. Our results show that both auditors and adversaries may be able to benefit from access to model parameters, and we advocate for further research into white-box methods for membership inference.
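To make the term "inverse-Hessian vector product" concrete, the following is a minimal sketch of one common way to approximate (H + λI)^{-1} v using only Hessian-vector products, via conjugate gradient with a damping term λ. It is not the authors' implementation and does not reproduce the IHA scoring function, which this section does not specify. The names `damped_ihvp`, `toy_hvp`, `H`, and `grad` are hypothetical, and the 2x2 matrix stands in for a model's loss Hessian purely for illustration.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def damped_ihvp(
    hvp: Callable[[Vector], Vector],
    v: Vector,
    damping: float = 1e-3,
    max_iters: int = 100,
    tol: float = 1e-10,
) -> Vector:
    """Approximate (H + damping * I)^{-1} v with conjugate gradient.

    `hvp(p)` must return H @ p for a symmetric H; the Hessian itself is
    never formed or inverted. Damping is assumed to make the system
    positive definite.
    """

    def apply(p: Vector) -> Vector:
        # Matrix-vector product with the damped Hessian.
        hp = hvp(p)
        return [h + damping * pi for h, pi in zip(hp, p)]

    x = [0.0] * len(v)
    r = list(v)  # residual v - A x, with x = 0 initially
    p = list(r)  # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iters):
        if rs_old < tol:  # squared residual norm is small enough
            break
        ap = apply(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Toy symmetric positive-definite matrix standing in for a Hessian (assumption).
H = [[4.0, 1.0], [1.0, 3.0]]


def toy_hvp(p: Vector) -> Vector:
    """Hessian-vector product for the toy matrix above."""
    return [dot(row, p) for row in H]


grad = [1.0, 2.0]  # stand-in for a per-record gradient (assumption)
x = damped_ihvp(toy_hvp, grad, damping=0.0)  # approximately [1/11, 7/11]
x_damped = damped_ihvp(toy_hvp, grad, damping=0.5)
```

In a real model the `hvp` callable would typically come from automatic differentiation instead of an explicit matrix, and the damping value trades numerical stability against fidelity to the undamped inverse.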
