Better Membership Inference Privacy Measurement through Discrepancy

Ruihan Wu, Pengrun Huang, Kamalika Chaudhuri

TL;DR

This work introduces a discrepancy-distance based empirical privacy metric that upper-bounds the advantage of score-based Membership Inference Attacks, enabling scalable privacy assessment for large, well-generalized models without training multiple shadow models. It formalizes the bound via convex discriminative sets, proving ${\rm Adv}(m; f, S, \mathcal{D})\leq D_{\mathcal{Q}}(S, \mathcal{D})$ for common MIAs, and provides a practical approximation CPM (Convex Polytope Machine) using a polytope surrogate loss. Empirically, CPM consistently upper-bounds standard MIAs on CIFAR and ImageNet-scale models, with performance improving as the facet count $K$ grows, and revealing that traditional scores may overfit to standard training recipes. To address modern models trained with sophisticated procedures, the authors propose training-procedure aware MIAs, such as MixUp-score and RelaxLoss-score, which achieve higher leakage when aligned with the training method. Overall, the discrepancy-based metric offers a scalable, stronger privacy evaluation tool, while the MixUp and RelaxLoss scores illustrate the potential for procedure-aware MIAs on contemporary models.
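To make the role of the facet count $K$ concrete, the sketch below evaluates the train/test gap for one fixed convex polytope with $K$ facets. It is only an illustration under stated assumptions: the names `Z_train`, `Z_test`, `W` and `b` are hypothetical, the choice of feature space (for example, model outputs) is an assumption, and the facets are given by hand rather than learned. The CPM described above fits the polytope with a surrogate loss, which this sketch does not attempt.

```python
import numpy as np

def in_polytope(Z, W, b):
    """Return True for each row z of Z with w_k . z + b_k <= 0 for all K facets."""
    # Z: (n, d) points, W: (K, d) facet normals, b: (K,) offsets (all hypothetical).
    return np.max(Z @ W.T + b, axis=1) <= 0.0

def polytope_gap(Z_train, Z_test, W, b):
    """Absolute difference between the fractions of train and test points inside."""
    return abs(in_polytope(Z_train, W, b).mean() - in_polytope(Z_test, W, b).mean())

# Toy example: the square [-1, 1]^2 written as K = 4 half-space constraints.
W = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([-1.0, -1.0, -1.0, -1.0])
Z_train = np.array([[0.0, 0.0], [0.5, 0.5]])   # both points inside the square
Z_test = np.array([[2.0, 0.0], [0.0, 0.0]])    # one point outside, one inside
gap = polytope_gap(Z_train, Z_test, W, b)
```

Any single polytope gives a lower estimate of the supremum over the whole family, so a larger $K$ (a richer family) can only keep or raise the best achievable gap.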

Abstract

Membership Inference Attacks have emerged as a dominant method for empirically measuring privacy leakage from machine learning models. Here, privacy is measured by the advantage, or gap, between a score or a function computed on the training and the test data. A major barrier to the practical deployment of these attacks is that they do not scale to large, well-generalized models: either the advantage is relatively low, or the attack involves training multiple models, which is highly compute-intensive. In this work, inspired by discrepancy theory, we propose a new empirical privacy metric that is an upper bound on the advantage of a family of membership inference attacks. We show that this metric does not involve training multiple models, can be applied to large ImageNet classification models in the wild, and has higher advantage than existing metrics on models trained with more recent and sophisticated training recipes. Motivated by our empirical results, we also propose new membership inference attacks tailored to these training losses.
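The notion of advantage as a train/test gap can be illustrated with a short sketch. It assumes a one-sided threshold attack in which a higher score means "member", and it picks the threshold on the same data it is evaluated on, which is optimistic. The function name `threshold_advantage` and the synthetic Gaussian scores are hypothetical and are not taken from the paper.

```python
import numpy as np

def threshold_advantage(train_scores, test_scores):
    """Largest gap between train and test acceptance rates over all thresholds."""
    train_scores = np.asarray(train_scores, dtype=float)
    test_scores = np.asarray(test_scores, dtype=float)
    # Candidate thresholds: every observed score value.
    thresholds = np.unique(np.concatenate([train_scores, test_scores]))
    best = 0.0
    for t in thresholds:
        # The attack predicts "member" when the score is at least t.
        train_rate = np.mean(train_scores >= t)
        test_rate = np.mean(test_scores >= t)
        best = max(best, train_rate - test_rate)
    return best

# Synthetic scores: training points score slightly higher on average.
rng = np.random.default_rng(0)
train = rng.normal(1.0, 1.0, size=1000)
test = rng.normal(0.0, 1.0, size=1000)
adv = threshold_advantage(train, test)
```

When the two score distributions coincide the advantage is zero, and when they are perfectly separated it is one; well-generalized models sit near the low end, which is the scaling barrier the abstract describes.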

Paper Structure

This paper contains 22 sections, 3 theorems, 15 equations, 6 figures, and 2 tables.

Key Result

Proposition 1

For any MIA $m$, if $Q_m\in \mathcal{Q}$, ${\rm Adv}(m; f, S, \mathcal{D})\leq D_{\mathcal{Q}}(S, \mathcal{D})$.
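The bound follows in one step once the definitions are written out. The definitions below are assumptions reconstructed from the abstract's description of advantage as a train/test gap; they are not quoted from the paper, and the dependence of $m$ on the model $f$ is suppressed for brevity.

```latex
\begin{align*}
Q_m &= \{(x, y) : m(x, y) = 1\}, \\
{\rm Adv}(m; f, S, \mathcal{D})
  &= \Pr_{(x,y)\sim S}\big[(x,y)\in Q_m\big] - \Pr_{(x,y)\sim \mathcal{D}}\big[(x,y)\in Q_m\big], \\
D_{\mathcal{Q}}(S, \mathcal{D})
  &= \sup_{Q\in\mathcal{Q}} \Big|\Pr_{(x,y)\sim S}\big[(x,y)\in Q\big] - \Pr_{(x,y)\sim \mathcal{D}}\big[(x,y)\in Q\big]\Big|, \\
{\rm Adv}(m; f, S, \mathcal{D})
  &\leq \Big|\Pr_{S}[Q_m] - \Pr_{\mathcal{D}}[Q_m]\Big|
  \leq \sup_{Q\in\mathcal{Q}} \Big|\Pr_{S}[Q] - \Pr_{\mathcal{D}}[Q]\Big|
  = D_{\mathcal{Q}}(S, \mathcal{D}) \quad \text{whenever } Q_m\in\mathcal{Q}.
\end{align*}
```

Under these assumed definitions, the only condition to check for a given attack is that its acceptance set $Q_m$ belongs to the family $\mathcal{Q}$.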

Figures (6)

  • Figure 1: CPM and the advantage of baselines on models trained on CIFAR-10, CIFAR-100, Texas and Purchase. As shown in the figures, CPM is an upper bound on the advantage of the baseline scores for most models.
  • Figure 2: CPM with different numbers of facets $K$. The figures show that, with a moderate value of $K$, CPM achieves a higher advantage than the existing MIAs, as expected of an upper bound.
  • Figure 3: CPM and the advantage of baseline scores on PyTorch models. It shows that CPM is very close to the advantage of the baseline scores for the V1 models, but the gap is significantly larger for the V2 models.
  • Figure 4: The advantage on CIFAR-10 when we find CPM with different numbers of facets $K$.
  • Figure 5: The advantage on Texas when we find CPM with different numbers of facets $K$.
  • ...and 1 more figure

Theorems & Definitions (6)

  • Proposition 1
  • Theorem 1
  • Proof sketch of Theorem 1
  • Proposition 2
  • Proof of Theorem 1
  • Proof of Proposition \ref{thm:cvx_up2}