Knock Knock, Who's There? Membership Inference on Aggregate Location Data

Apostolos Pyrgelis, Carmela Troncoso, Emiliano De Cristofaro

TL;DR

This work formalizes membership inference on aggregate location time-series as a distinguishability game and demonstrates that an adversary with realistic prior knowledge can accurately infer whether a target contributed to released aggregates, especially when groups are small or mobility patterns are regular. The authors instantiate the distinguishing function as ML classifiers, evaluate on two real mobility datasets (TFL and SFC), and quantify privacy loss via AUC-based metrics. They also study differential privacy defenses, showing substantial privacy gains in passive settings but notable reductions in protection when attackers mimic the defense using noisy data, all at a cost to utility. The paper provides a practical methodology for providers and regulators to assess privacy risks before data release and to compare defense strategies in real-world, continual-release settings.

Abstract

Aggregate location data is often used to support smart services and applications, e.g., generating live traffic maps or predicting visits to businesses. In this paper, we present the first study on the feasibility of membership inference attacks on aggregate location time-series. We introduce a game-based definition of the adversarial task, and cast it as a classification problem where machine learning can be used to distinguish whether or not a target user is part of the aggregates. We empirically evaluate the power of these attacks on both raw and differentially private aggregates using two mobility datasets. We find that membership inference is a serious privacy threat, and show how its effectiveness depends on the adversary's prior knowledge, the characteristics of the underlying location data, as well as the number of users and the timeframe on which aggregation is performed. Although differentially private mechanisms can indeed reduce the extent of the attacks, they also yield a significant loss in utility. Moreover, a strategic adversary mimicking the behavior of the defense mechanism can greatly limit the protection they provide. Overall, our work presents a novel methodology geared to evaluate membership inference on aggregate location data in real-world settings and can be used by providers to assess the quality of privacy protection before data release or by regulators to detect violations.
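The abstract's framing of membership inference as a classification problem can be illustrated with a small sketch. Everything below is an assumption made for illustration rather than the authors' setup: the synthetic users (each favouring one hypothetical "home" location), the adversary's full knowledge of every user's trace, the parameter values, and the nearest-centroid distinguisher, which stands in for the ML classifiers used in the paper. The adversary samples labelled aggregates with and without the target, fits the distinguisher, and is scored by AUC on fresh aggregates.

```python
import random

def make_user(rng, n_loc, n_slots):
    """Hypothetical synthetic user who mostly visits one 'home' location."""
    home = rng.randrange(n_loc)
    return [[1 if loc == home and rng.random() < 0.8 else 0
             for _ in range(n_slots)] for loc in range(n_loc)]

def aggregate(group, n_loc, n_slots):
    """Flattened location-by-time visit counts for a group of users."""
    return [sum(u[loc][t] for u in group)
            for loc in range(n_loc) for t in range(n_slots)]

def sample_aggregate(rng, users, target, m, with_target):
    """Aggregate a random group of size m, with or without the target."""
    others = users[:target] + users[target + 1:]
    group = rng.sample(others, m - 1 if with_target else m)
    if with_target:
        group.append(users[target])
    return aggregate(group, len(users[0]), len(users[0][0]))

def centroid(rows):
    """Per-feature mean of a list of equally long feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def auc(pos_scores, neg_scores):
    """Probability that a random 'in' sample outscores a random 'out' one."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

def run_attack(seed=0, n_users=60, n_loc=5, n_slots=12, m=5,
               n_train=200, n_test=100):
    """Play the game many times for one target and return the attack AUC."""
    rng = random.Random(seed)
    users = [make_user(rng, n_loc, n_slots) for _ in range(n_users)]
    target = 0  # assumption: the adversary targets the first user

    def draw(with_target):
        return sample_aggregate(rng, users, target, m, with_target)

    # Training: labelled aggregates built from the adversary's prior knowledge.
    c_in = centroid([draw(True) for _ in range(n_train)])
    c_out = centroid([draw(False) for _ in range(n_train)])

    def score(x):
        # Higher when x is closer to the 'target included' centroid.
        return sq_dist(x, c_out) - sq_dist(x, c_in)

    # Testing: fresh aggregates play the role of the challenger's releases.
    pos = [score(draw(True)) for _ in range(n_test)]
    neg = [score(draw(False)) for _ in range(n_test)]
    return auc(pos, neg)

if __name__ == "__main__":
    print("attack AUC:", run_attack())
```

An AUC near 0.5 would mean the adversary does no better than guessing. Varying `m` or `n_slots` in this toy setting is one way to explore how group size and the length of the inference period might affect the attack.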

Paper Structure

This paper contains 24 sections, 7 equations, 13 figures, 3 tables.

Figures (13)

  • Figure 1: Distinguishability Game (DG) between adversary Adv and challenger Ch, capturing membership inference over aggregate location time-series. The game is parameterized by the set of users ($U$), the aggregation group size ($m$) and the inference period ($T_I$).
  • Figure 2: Subset of Locations prior (TFL, $\alpha = 0.11$, $|T_I| = 168$) -- Adv's performance for different values of $m$.
  • Figure 3: Subset of Locations prior - Privacy Loss (PL) for different values of $m$.
  • Figure 4: Subset of Locations prior (SFC, $\alpha = 0.2$, $|T_I|=168$) -- Adv's performance for different values of $m$.
  • Figure 5: Same Groups as Released prior (TFL, 75%-25% split, $\beta=150$, $|T_I|=168$) -- Adv's performance for different values of $m$.
  • ...and 8 more figures