Some Targets Are Harder to Identify than Others: Quantifying the Target-dependent Membership Leakage
Achraf Azize, Debabrota Basu
TL;DR
This work formalizes fixed-target Membership Inference (MI) games to quantify how target points differ in leakage, revealing that target hardness scales with the Mahalanobis distance $\| z^\star - \mu \|_{C_\sigma^{-1}}$. By analysing the Likelihood Ratio (LR) test, it derives exact asymptotics for MI against the empirical mean, showing that the leakage and the attack power depend on $m^\star = \lim_{n,d} \frac{1}{n} \| z^\star - \mu \|^2_{C_\sigma^{-1}}$, and it connects to prior tracing results through an Edgeworth-Lindeberg analysis. The paper also quantifies the effects of two privacy defences (Gaussian noise and sub-sampling) and of target misspecification, and it introduces a covariance-based LR attack along with a Mahalanobis-based canary selection strategy for privacy auditing in white-box federated learning. Experiments on synthetic and real datasets validate the theory and show that the covariance attack can outperform scalar-product baselines, with the Mahalanobis score explaining fixed-target MI hardness and guiding efficient canary selection. Overall, the work provides a principled, geometry-aware framework for auditing and strengthening privacy in MI contexts, and it outlines directions for extending the analysis to Z-estimators and influence-function-based attacks.
Abstract
In a Membership Inference (MI) game, an attacker tries to infer whether a target point was included in the input of an algorithm. Existing works show that some target points are easier to identify, while others are harder. This paper explains the target-dependent hardness of membership attacks by studying the power of the optimal attacks in a fixed-target MI game. We characterise the optimal advantage and trade-off functions of attacks against the empirical mean in terms of the Mahalanobis distance between the target point and the data-generating distribution. We further derive the impact on optimal attacks of two privacy defences, i.e. adding Gaussian noise and sub-sampling, and of target misspecification. As by-products of our novel analysis of the Likelihood Ratio (LR) test, we provide a new covariance attack, which generalises and improves the scalar product attack, and we propose a new optimal canary-choosing strategy for auditing privacy in the white-box federated learning setting. Our experiments validate that the Mahalanobis score explains the hardness of fixed-target MI games.
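To make the role of the Mahalanobis score concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes Gaussian data, treats $C_\sigma$ as a plain covariance matrix `cov`, and uses a covariance-weighted scalar product $(z^\star - \mu)^\top C_\sigma^{-1} (\hat{\mu}_n - \mu)$ as an illustrative attack score. The function names (`mahalanobis_score`, `covariance_weighted_score`, `simulate_gap`) and all parameter values are hypothetical. The sketch checks numerically that the average gap between the "target in" and "target out" scores matches $\frac{1}{n} \| z^\star - \mu \|^2_{C_\sigma^{-1}}$, so targets far from the distribution in Mahalanobis distance are easier to separate.

```python
import numpy as np


def mahalanobis_score(z_star, mu, cov, n):
    """Return (1/n) * ||z_star - mu||^2 measured in the C^{-1} norm."""
    diff = z_star - mu
    return float(diff @ np.linalg.solve(cov, diff)) / n


def covariance_weighted_score(z_star, mean_hat, mu, cov):
    """Illustrative attack score: (z_star - mu)^T C^{-1} (mean_hat - mu).

    Assumption: this stands in for a covariance-based attack; with
    cov = identity it reduces to a plain scalar product.
    """
    return float((z_star - mu) @ np.linalg.solve(cov, mean_hat - mu))


def simulate_gap(z_star, mu, cov, n, trials, seed=0):
    """Average score gap between datasets containing z_star and not."""
    rng = np.random.default_rng(seed)
    # data has shape (trials, n, d): one dataset of n points per trial.
    data = rng.multivariate_normal(mu, cov, size=(trials, n))
    weights = np.linalg.solve(cov, z_star - mu)  # C^{-1} (z_star - mu)

    means_out = data.mean(axis=1)      # empirical means, target absent
    data[:, 0, :] = z_star             # swap the first point for the target
    means_in = data.mean(axis=1)       # empirical means, target present

    scores_out = (means_out - mu) @ weights
    scores_in = (means_in - mu) @ weights
    return float(scores_in.mean() - scores_out.mean())


if __name__ == "__main__":
    d, n = 5, 20
    mu = np.zeros(d)
    cov = np.eye(d)
    for shift in (0.5, 2.0):           # a "typical" and an "atypical" target
        z_star = mu + shift * np.ones(d)
        m = mahalanobis_score(z_star, mu, cov, n)
        gap = simulate_gap(z_star, mu, cov, n, trials=4000)
        print(f"shift={shift}: score={m:.3f}, simulated gap={gap:.3f}")
```

In this toy setting the simulated gap tracks the Mahalanobis score, which is consistent with the claim that the score explains which targets are harder to identify. The sketch does not model the Gaussian-noise or sub-sampling defences.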
