Do Parameters Reveal More than Loss for Membership Inference?

Anshuman Suri; Xiao Zhang; David Evans

TL;DR

This paper shows that black-box access is generally insufficient for optimal membership inference (MI) against models trained with stochastic gradient descent (SGD): optimal inference requires access to model parameters. It derives a theoretical framework linking SGD dynamics, the Hessian, and the posterior over parameters to MI, and introduces the Inverse Hessian Attack (IHA), a white-box auditing method that scores records using inverse-Hessian-vector products and gradient signals. Empirically, IHA matches or surpasses state-of-the-art reference-model attacks on several datasets, and the results highlight the critical role of the Hessian-informed terms. IHA audits privacy leakage without training reference models, albeit at notable computational cost. The findings advocate for broader investigation of white-box MI methods, careful consideration of Hessian structure, and practical damping and approximation strategies to enable scalable privacy auditing in real-world systems.

Abstract

Membership inference attacks are used as a key tool for disclosure auditing. They aim to infer whether an individual record was used to train a model. While such evaluations are useful to demonstrate risk, they are computationally expensive and often make strong assumptions about potential adversaries' access to models and training environments, and thus do not provide tight bounds on leakage from potential attacks. We show how prior claims around black-box access being sufficient for optimal membership inference do not hold for stochastic gradient descent, and that optimal membership inference indeed requires white-box access. Our theoretical results lead to a new white-box inference attack, IHA (Inverse Hessian Attack), that explicitly uses model parameters by taking advantage of computing inverse-Hessian vector products. Our results show that both auditors and adversaries may be able to benefit from access to model parameters, and we advocate for further research into white-box methods for membership inference.
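To make the phrase "inverse-Hessian vector products" concrete, the following is a minimal sketch of computing a damped inverse-Hessian-vector product with conjugate gradient, without ever forming the Hessian. It is not the paper's IHA score: the toy logistic-regression loss, the synthetic data, the `damping` value, and the function names `hessian_vector_product` and `damped_ihvp` are all assumptions chosen for illustration.

```python
import numpy as np

def hessian_vector_product(X, w, v):
    """Hessian-vector product of the mean logistic loss at w (H is never formed)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    s = p * (1.0 - p)                      # per-example curvature weights
    return X.T @ (s * (X @ v)) / X.shape[0]

def damped_ihvp(X, w, v, damping=1e-2, tol=1e-10, max_iter=200):
    """Solve (H + damping * I) x = v by conjugate gradient."""
    x = np.zeros_like(v)
    r = v.copy()                           # residual v - A x, with x = 0
    d = r.copy()                           # search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ad = hessian_vector_product(X, w, d) + damping * d
        alpha = rs / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        rs_new = r @ r
        d = r + (rs_new / rs) * d
        rs = rs_new
    return x

# Synthetic data and parameters (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (rng.random(200) < 0.5).astype(float)
w = 0.1 * rng.normal(size=5)

# Gradient of the logistic loss on a single candidate record.
x1, y1 = X[0], y[0]
g = (1.0 / (1.0 + np.exp(-(x1 @ w))) - y1) * x1

ihvp = damped_ihvp(X, w, g)                # approximates (H + damping * I)^{-1} g
```

The damping term keeps the linear system well conditioned when the Hessian has near-zero eigenvalues, which is one reason damping appears among the practical strategies mentioned in the TL;DR. For large models, each Hessian-vector product would typically come from automatic differentiation rather than a closed form.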

Paper Structure (29 sections, 5 theorems, 35 equations, 1 figure, 9 tables)

Introduction
Preliminaries
Membership Inference
Discrete-time SGD Dynamics
Black-Box Access is not Sufficient
Limitations of Claims of Black-Box Optimality
Optimal Membership Inference under Discrete-time SGD
Inverse Hessian Attack
Experiments
Setup
Results
Ablating over terms inside IHA
Comparison with Leave-One-Out Setting
Inter-Attack Agreement
Runtime Comparison
...and 14 more sections

Key Result

Lemma 2.1

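Let $\mathcal{T} = \{\bm{z}_2,\ldots,\bm{z}_n, m_2, \ldots, m_n\}$. Given model parameters $\bm{w}$ and a record $\bm{z}_1$, the optimal membership inference is given by:

$$\mathcal{M}(\bm{w}, \bm{z}_1) = \mathbb{E}_{\mathcal{T}}\left[\sigma\left(\log\left(\frac{p(\bm{w} \mid m_1 = 1, \bm{z}_1, \mathcal{T})}{p(\bm{w} \mid m_1 = 0, \bm{z}_1, \mathcal{T})}\right) + \log\left(\frac{\gamma}{1-\gamma}\right)\right)\right],$$

where $\sigma(u) = (1+\exp(-u))^{-1}$ is the sigmoid function, and $\gamma = \mathbb{P}(m_1 = 1)$.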
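As a rough illustration of how the sigmoid $\sigma$ and the prior $\gamma$ combine in this lemma, the sketch below assumes the score has the form given by Sablayrolles et al. (2019): a sigmoid applied to a log posterior ratio plus the prior log-odds, averaged over sampled $\mathcal{T}$. The function name `membership_posterior` and the log-ratio inputs are hypothetical; the sketch does not compute the log-ratios themselves, which is the hard part.

```python
import math

def sigmoid(u):
    """sigma(u) = (1 + exp(-u))^{-1}, as defined in the lemma."""
    return 1.0 / (1.0 + math.exp(-u))

def membership_posterior(log_ratios, gamma):
    """Monte Carlo average over sampled T of sigma(log-ratio + prior log-odds).

    log_ratios: one value of log p(w | m_1=1, z_1, T) - log p(w | m_1=0, z_1, T)
                per sampled T (hypothetical inputs, supplied by the caller).
    gamma:      prior probability that z_1 is a member, P(m_1 = 1).
    """
    prior_log_odds = math.log(gamma / (1.0 - gamma))
    return sum(sigmoid(r + prior_log_odds) for r in log_ratios) / len(log_ratios)

# With an uninformative log-ratio of 0, the score falls back to the prior.
print(membership_posterior([0.0], 0.5))  # prints 0.5
```

A positive log-ratio pushes the score above the prior $\gamma$ and a negative one pushes it below, so the parameters $\bm{w}$ matter only through how much more likely they are when $\bm{z}_1$ is in the training set.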

Figures (1)

Figure 1: ROC curves for low-FPR region for various attacks and datasets.

Theorems & Definitions (6)

Definition 2.1: Membership Inference
Lemma 2.1 (Sablayrolles et al., 2019)
Theorem 2.2: SGD Stationary distribution with momentum
Theorem 2.3: SGD Noise Covariance
Theorem 3.1: Posterior for SGD
Theorem 3.2: Optimal Membership-Inference Score
