Sub-optimal Learning in Meta-Classifier Attacks: A Study of Membership Inference on Differentially Private Location Aggregates

Yuhan Liu; Florent Guepin; Igor Shilov; Yves-Alexandre De Montjoye

TL;DR

The paper addresses privacy auditing for differentially private location aggregates by exposing a gap between the attack accuracy expected under the DP guarantee and the accuracy that empirical MIAs achieve. It introduces two metric-based MIAs, the one-threshold attack and the two-threshold attack, and shows that the best attack depends on the noise distribution: the one-threshold attack is more effective under Gaussian DP noise, while the two-threshold attack performs better under Laplace noise. The authors prove that MLPs can encode such complex rules and can learn them given sufficient training data, but MLPs trained on typical small samples converge to the simpler one-threshold rule and therefore underestimate privacy risk in Laplace-noised settings. In experiments on real-world Milano data, the metric-based MIAs outperform traditional meta-classifier MIAs when shadow data is modest, and increasing the shadow data lets MLPs reach comparable performance, which suggests synthetic data and pre-training as practical remedies. Overall, the work offers actionable guidance for improving MIA techniques and notes that the findings extend to other DP-protected datasets with multiple observations per individual.

Abstract

The widespread collection and sharing of location data, even in aggregated form, raises major privacy concerns. Previous studies used meta-classifier-based membership inference attacks (MIAs) with multi-layer perceptrons (MLPs) to estimate privacy risks in location data, including when protected by differential privacy (DP). In this work, however, we show that a significant gap exists between the expected attack accuracy given by DP and the empirical attack accuracy even with informed attackers (also known as DP attackers), indicating a potential underestimation of the privacy risk. To explore the potential causes of the observed gap, we first propose two new metric-based MIAs: the one-threshold attack and the two-threshold attack. We evaluate their performance on real-world location data and find that different data distributions require different attack strategies for optimal performance: the one-threshold attack is more effective with Gaussian DP noise, while the two-threshold attack performs better with Laplace DP noise. Comparing their performance with one of the MLP-based attack models from previous work shows that the MLP only learns the one-threshold rule, leading to suboptimal performance under Laplace DP noise and an underestimation of the privacy risk. Second, we theoretically prove that MLPs can encode complex rules (e.g., the two-threshold attack rule), which can be learned when given a substantial amount of training data. We conclude by discussing the implications of our findings in practice, including broader applications extending beyond location aggregates to any differentially private dataset containing multiple observations per individual, and how techniques such as synthetic data generation and pre-training might enable MLPs to learn more complex optimal rules.
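The difference between the two metric-based attacks can be made concrete with a minimal Python sketch. The exact score functions are defined in the paper body, which is not reproduced here, so the following is an assumption: `score_one_threshold` sums the released noisy counts at the target's positive observations, and `score_two_threshold` counts how many of those cells exceed a per-observation sub-threshold. The function names, the toy aggregate, and the final thresholds are all hypothetical; the sub-threshold of 1.5 is borrowed from the Figure 5 caption.

```python
from typing import Sequence


def score_one_threshold(released: Sequence[float], positive_idx: Sequence[int]) -> float:
    """Assumed Score_1: sum of the released (noisy) counts at the
    cells where the target trace has a positive observation."""
    return sum(released[i] for i in positive_idx)


def score_two_threshold(
    released: Sequence[float], positive_idx: Sequence[int], sub_threshold: float
) -> int:
    """Assumed Score_2: number of positive-observation cells whose
    released count exceeds the per-observation sub-threshold."""
    return sum(1 for i in positive_idx if released[i] > sub_threshold)


def predict_member(score: float, threshold: float) -> bool:
    """Final decision: predict 'member' when the score exceeds the threshold.
    In the paper's pipeline the threshold is estimated from auxiliary data;
    here it is simply passed in."""
    return score > threshold


# Hypothetical released aggregate with six cells; the target trace
# has positive observations at cells 0, 2 and 5.
released = [2.3, 0.1, 1.7, 4.0, 0.9, 1.2]
positive_idx = [0, 2, 5]

s1 = score_one_threshold(released, positive_idx)            # 2.3 + 1.7 + 1.2
s2 = score_two_threshold(released, positive_idx, 1.5)       # cells 0 and 2 exceed 1.5

# Illustrative thresholds only, not values from the paper.
member_by_one = predict_member(s1, threshold=4.5)
member_by_two = predict_member(s2, threshold=1)
```

Under this reading, the one-threshold rule is a single linear cut on the summed counts, whereas the two-threshold rule first binarises each observation and then thresholds the count, which is the more complex rule the abstract says a small-sample MLP fails to learn.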

Paper Structure (37 sections, 5 theorems, 19 equations, 27 figures, 7 algorithms)

Introduction
Backgrounds
Location Data
User Traces
Location Aggregates
Differential Privacy
Definitions and Tools
Differential Privacy on Location Aggregate
Membership Inference Attack against Location Data
Membership Inference Game
Evaluation Metric
Meta-classifier-based MIAs
Metric-based Attack
Overview
Overall Methodology
...and 22 more sections

Key Result

Theorem 1

For any $\varepsilon\geq0$ and $\delta\in[0,1]$, the class of $\left(\varepsilon,\delta\right)$-differentially private mechanisms satisfies $\left(\left(k-2i\right)\varepsilon,1-\left(1-\delta\right)^{k}\left(1-\delta_{i}\right)\right)$-differential privacy under $k$-fold adaptive composition, for a

Figures (27)

Figure 1: Comparison between the expected accuracy and the attack accuracy of the typical meta-classifier-based attack (MLP with one hidden layer and 100 nodes) with the informed attacker. Each observation is perturbed with the Laplace mechanism $\mathrm{Lap}(\frac{1}{0.5})$. To compute the expected attack accuracy, we first compute the expected false positive rate ($\alpha$) and false negative rate ($\beta$) of the MIA given a DP mechanism under multiple observations, as in [kairouz2015composition]. Then, the expected attack accuracy is derived as $ACC=\frac{1-\alpha}{2}+\frac{1-\beta}{2}$.
Figure 2: An illustration of user traces and their aggregation. The traces subfigure depicts a set of individuals $I$, where $t\in E$ is a timestamp of interest (also referred to as an epoch) and $s\in S$ is a site of interest. The aggregates subfigure shows an aggregate $A$ of the user traces. If the trace shown in the traces subfigure is the target trace, the cells in red in $A$ are positive observations.
Figure 3: The overall pipeline of MIAs with a meta-classifier: infer the membership of the target $z$ from a released aggregate $\tilde{A}$ derived from the target dataset $D$ by using an attack model trained on an auxiliary dataset $D_{AUX}$ that follows the same underlying data distribution as $D$.
Figure 4: The overall pipeline of the metric-based MIAs: infer the membership of the target $z$ from a released aggregate $\tilde{A}$ derived from a target dataset $D$ by defining score functions and estimating thresholds using the auxiliary dataset $D_{AUX}$ that follows the same underlying data distribution as $D$. The membership of $z$ is then decided by comparing the score of the released aggregate and the estimated thresholds.
Figure 5: An illustration of the two score functions designed for the one-threshold attack (i.e., $\texttt{Score}_1$) and the two-threshold attack (i.e., $\texttt{Score}_2$), respectively, given a target trace $z$ and an aggregate $A$. Without loss of generality, we suppose the sub-threshold for each positive observation is $1.5$.
...and 22 more figures
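
The accuracy formula in the Figure 1 caption can be illustrated with a short Python sketch. It covers only the simplest case, a single observation protected by pure $\varepsilon$-DP, and does not reproduce the multi-observation composition bound the caption refers to. The helper names are hypothetical, and reading $\mathrm{Lap}(\frac{1}{0.5})$ as $\varepsilon = 0.5$ assumes a sensitivity of 1.

```python
import math


def expected_accuracy(alpha: float, beta: float) -> float:
    """ACC = (1 - alpha)/2 + (1 - beta)/2, where alpha is the false positive
    rate and beta the false negative rate, with members and non-members
    equally likely."""
    return (1.0 - alpha) / 2.0 + (1.0 - beta) / 2.0


def balanced_error_pure_dp(epsilon: float) -> float:
    """Smallest alpha = beta allowed by pure epsilon-DP for one observation.
    The constraints alpha + e^eps * beta >= 1 and e^eps * alpha + beta >= 1
    meet at alpha = beta = 1 / (1 + e^eps), which also minimises alpha + beta."""
    return 1.0 / (1.0 + math.exp(epsilon))


# Assumption: sensitivity 1, so Lap(1/0.5) corresponds to epsilon = 0.5.
eps = 0.5
err = balanced_error_pure_dp(eps)
acc = expected_accuracy(err, err)  # equals e^eps / (1 + e^eps)
```

With more observations per individual, the admissible $\alpha$ and $\beta$ shrink according to the composition theorem, so the expected accuracy rises above this single-observation value.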

Theorems & Definitions (8)

Definition 1: Differential Privacy [dwork2006differential]
Definition 2: $l_p$-sensitivity
Theorem 1: Optimal Composition [kairouz2015composition]
Definition 3: Membership Inference Game [ye2022enhanced]
Theorem 2
Theorem 3
Theorem 4
Theorem 5
