Membership Privacy Risks of Sharpness Aware Minimization

Young In Kim; Andrea Agiollo; Pratiksha Agrawal; Johannes O. Royset; Rajiv Khanna

TL;DR

This work challenges the intuition that flatter minima necessarily enhance membership privacy by showing that Sharpness-Aware Minimization (SAM) can increase membership inference risk even as it improves generalization. The authors combine empirical analyses of memorization and influence with a theoretical model of a mixture data distribution to explain why SAM's emphasis on atypical subclass patterns improves generalization but heightens privacy vulnerability. They introduce memorization and influence metrics, demonstrate that SAM memorizes more atypical sub-patterns and amplifies mid-to-high memorization samples, and provide a formal result showing higher minority-subclass alignment can yield both better generalization and higher MIA risk. The findings highlight a nuanced privacy-generalization trade-off in flat-minima optimization and motivate privacy-aware approaches for training robust models with constrained leakage.

Abstract

Optimization algorithms that seek flatter minima, such as Sharpness-Aware Minimization (SAM), are widely credited with improved generalization. We ask whether such gains impact membership privacy. Surprisingly, we find that SAM is more prone to membership inference attacks than classical SGD across multiple datasets and attack methods, despite achieving lower test error. This is intriguing because conventional belief associates higher membership privacy risk with poor generalization. We conjecture that SAM memorizes atypical subpatterns to a greater degree, which leads to better generalization but higher privacy risk. We empirically validate this hypothesis through extensive analysis of memorization and influence scores. Finally, we theoretically show how a model that captures minority subclass features more strongly can both generalize better and have higher membership privacy risk.
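For readers unfamiliar with SAM, the following is a minimal sketch of one SAM update next to one plain gradient step, assuming the standard two-step formulation (ascend to a perturbed point at distance `rho` along the normalized gradient, then descend using the gradient evaluated there). The logistic-regression loss, the function names, and the hyperparameters `lr` and `rho` are illustrative assumptions and are not taken from the paper's experimental setup.

```python
import math

def grad_logistic(w, data):
    """Gradient of the mean logistic loss log(1 + exp(-y * w.x)), labels y in {-1, +1}."""
    g = [0.0] * len(w)
    for x, y in data:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        coeff = -y / (1.0 + math.exp(margin))  # derivative of the loss w.r.t. w.x
        for j, xj in enumerate(x):
            g[j] += coeff * xj / len(data)
    return g

def sgd_step(w, data, lr):
    """Plain (full-batch) gradient step, used here as the SGD baseline."""
    g = grad_logistic(w, data)
    return [wi - lr * gi for wi, gi in zip(w, g)]

def sam_step(w, data, lr, rho):
    """One SAM step: perturb w toward higher loss, then descend with that gradient."""
    g = grad_logistic(w, data)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # stationary point: nothing to perturb or update
    # Ascent step: worst-case perturbation within an L2 ball of radius rho
    # (first-order approximation).
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: gradient is evaluated at the perturbed weights,
    # but applied to the original weights.
    g_adv = grad_logistic(w_adv, data)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

# Toy, hypothetical data: (features, label) pairs.
toy_data = [([1.0, 2.0], 1), ([-1.0, -1.5], -1), ([0.5, -0.3], 1)]
w0 = [0.0, 0.0]
w_sgd = sgd_step(w0, toy_data, lr=0.1)
w_sam = sam_step(w0, toy_data, lr=0.1, rho=0.5)
```

With `rho = 0` the perturbation vanishes and `sam_step` reduces to `sgd_step`, which makes the radius `rho` the single knob separating the two optimizers in this sketch.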

Paper Structure (47 sections, 6 theorems, 54 equations, 6 figures, 3 tables)

Introduction
Contributions
Background & Preliminaries
Memorization & Influence scores
Sharpness Aware Minimization (SAM)
Membership Inference attacks
Privacy Risks of SAM
Datasets
Methods
Results
SAM learns Atypical Subclass Features More
SAM Memorizes Atypical Sub-patterns More
SAM Increases Influence of High Memorization Samples
SAM's Generalization Gain Comes From Higher Memorization of Sub-patterns
Summary of experimental findings
...and 32 more sections

Key Result

Theorem 1

Let $\mathbf{w}^{(A)},\mathbf{w}^{(B)}$ be two interpolating solutions trained on the same $D$. Under Definition 1 (Data Model) and regularity conditions, if $\mathbf{w}^{(A)} \;\overset{\mathrm{MSA}}{\succcurlyeq}\; \mathbf{w}^{(B)}$, then $\mathbf{w}^{(A)}$ generalizes at least as well as $\mathbf{w}^{(B)}$, with strict inequality if $F_B((A^{(B)},A^{(A)}])>0$.

Figures (6)

Figure 1: (a): Memorization score density plot for SAM vs SGD. SAM has less density in the lowest range, but more density spread evenly across the remaining range. (b): Memorization scores of CIFAR-100 training samples under SAM and SGD. The regression curve (in green) shows a consistent deviation from the identity line (in red), indicating that SAM memorizes a larger subset of samples in the lower score range which are likely to be atypical subclass samples. (c): Visualization of samples more memorized by SAM for the tiger class following the same setting of (b).
Figure 2: (a) and (b): Influence scores, plotted against memorization scores, of the 20 training samples with the highest influence score in each memorization interval, for SGD (a) and SAM (b). The regression analysis (green lines) shows that SAM maintains a smoother influence distribution, amplifying mid-to-high memorization samples (subclass features), while SGD relies more heavily on a narrow set of highly memorized points (noise). (c): Difference in influence scores between SAM and SGD as a function of memorization score differences. SAM downweights low-memorization samples and selectively amplifies the influence of mid-to-high memorization samples.
Figure 3: Test images (boxed) from buckets 1 and 5 and their respective top-10 influential training images. For each object, the top row is an image from bucket 1 and the bottom row is an image from bucket 5. For bucket 1 images (higher memorization, top row), notice that the images are atypical for their classes, and there is a near duplicate in the training data that was important for generalizing on this test image. For bucket 5 images, on the other hand, the top influential images are reminiscent of the test image at a conceptual level.
Figure 4: (a): Test accuracy on $\mathcal{I}_{ent}$ groups as evaluated by equation (entr). SAM's performance gain comes from correctly predicting more atypical data points, which require memorization of atypical sub-patterns to be classified correctly. (b) and (c): Distribution of the memorization scores of the top-1 most influential training point for $\mathcal{I}_{ent}$ buckets 1 and 5. Test samples falling in the lower (higher) numbered buckets are influenced by training points with higher (lower) memorization.
Figure 5: A synthetic construction illustrating the generalization ability of SAM over SGD for atypical examples. Fig (a) shows class density contours of a two-class, 2-dimensional classification problem, along with the Bayes Optimal solution. The red class has two 'clusters', one representing typical examples and one representing atypical examples. Fig (b) shows an instance of data sampled from the densities shown in (a); the larger cluster of red dots represents typical examples in the red class, and the red '+' points represent far fewer atypical examples. SAM generalizes better than SGD in this case. Fig (c) shows that if there are enough samples generated from both typical and atypical clusters, SAM and SGD coincide with the Bayes Optimal classifier.
...and 1 more figures
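
The captions above rely on per-sample memorization and influence scores. As a rough sketch, assuming the common subsampling estimator (train many models on random subsets of the training set and compare accuracy on a target example with and without a given training example in the subset), both scores can be computed from the same bookkeeping. The function names and the toy inputs are hypothetical, and the paper's exact estimator may differ.

```python
def inclusion_gap(in_subset, correct, i, j):
    """Accuracy on target j for models trained WITH example i, minus WITHOUT it.

    in_subset[k][i]: truthy if model k's training subset contained example i.
    correct[k][j]:   truthy if model k classified target j correctly.
    Returns None when one of the two groups of models is empty.
    """
    with_i = [correct[k][j] for k in range(len(in_subset)) if in_subset[k][i]]
    without_i = [correct[k][j] for k in range(len(in_subset)) if not in_subset[k][i]]
    if not with_i or not without_i:
        return None
    return sum(with_i) / len(with_i) - sum(without_i) / len(without_i)

def memorization_score(in_subset, train_correct, i):
    """Memorization of training example i: its effect on its own prediction."""
    return inclusion_gap(in_subset, train_correct, i, i)

def influence_score(in_subset, test_correct, i, j):
    """Influence of training example i on test example j."""
    return inclusion_gap(in_subset, test_correct, i, j)

# Hypothetical records from 4 subsampled models, 2 training and 1 test example.
in_subset     = [[1, 0], [1, 1], [0, 1], [0, 0]]
train_correct = [[1, 0], [1, 1], [0, 1], [1, 0]]
test_correct  = [[1], [1], [0], [0]]

mem_0 = memorization_score(in_subset, train_correct, 0)
mem_1 = memorization_score(in_subset, train_correct, 1)
infl_0 = influence_score(in_subset, test_correct, 0, 0)
infl_1 = influence_score(in_subset, test_correct, 1, 0)
```

In this toy record, training example 1 is only ever classified correctly by models that saw it, so it receives the highest memorization score, while training example 0 is the one that changes the prediction on the test example.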

Theorems & Definitions (17)

Definition 1: Data Model
Definition 2: Minority Subclass Alignment Order
Definition 3: Confidence-threshold attacker
Theorem 1: Higher MSA $\implies$ Better Generalization
Theorem 2: Higher MSA $\implies$ Higher Attacker's Advantage
Lemma 1: High-dimensional near-orthogonality
proof
Remark 1: Distribution $F_B$ and why $F_B(A)$ appears
Definition 4: Generalization gap
Lemma 2: Generalization gap of majority subclass samples
...and 7 more
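
Definition 3 in the list above names a confidence-threshold attacker. A minimal sketch of such an attacker follows, assuming it predicts "member" whenever the model's confidence on the true label is at least a threshold, and assuming the attacker's advantage is measured as true-positive rate minus false-positive rate. Both assumptions follow common usage in the membership inference literature and are not copied from the paper's formal definition; the confidence values are made up for illustration.

```python
def attack_advantage(member_conf, nonmember_conf, threshold):
    """Advantage (TPR - FPR) of the rule: predict 'member' iff confidence >= threshold."""
    tpr = sum(c >= threshold for c in member_conf) / len(member_conf)
    fpr = sum(c >= threshold for c in nonmember_conf) / len(nonmember_conf)
    return tpr - fpr

def best_threshold(member_conf, nonmember_conf):
    """Threshold, chosen among the observed confidences, that maximizes the advantage."""
    candidates = sorted(set(member_conf) | set(nonmember_conf))
    return max(
        candidates,
        key=lambda t: attack_advantage(member_conf, nonmember_conf, t),
    )

# Hypothetical true-label confidences for training (member) and held-out samples.
member_conf = [0.99, 0.95, 0.9, 0.6]
nonmember_conf = [0.9, 0.7, 0.5, 0.4]

tau = best_threshold(member_conf, nonmember_conf)
advantage = attack_advantage(member_conf, nonmember_conf, tau)
```

An advantage of 0 corresponds to an attacker that cannot separate members from non-members at that threshold, and an advantage of 1 to perfect separation. Under this reading, a model that fits atypical training samples with higher confidence than comparable held-out samples widens the gap the attacker exploits.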
