Where have you been? A Study of Privacy Risk for Point-of-Interest Recommendation

Kunlin Cai, Jinghuai Zhang, Zhiqing Hong, Will Shand, Guang Wang, Desheng Zhang, Jianfeng Chi, Yuan Tian

TL;DR

This work studies privacy risks in neural POI recommendation models trained on mobility data. It introduces a privacy attack suite that combines data extraction and membership inference attacks tailored to location and trajectory data. The four attacks (LocExtract, TrajExtract, LocMIA, and TrajMIA) are evaluated on two real-world datasets (Foursquare and Gowalla) across three POI models (GETNext, LSTPM, and RNN), revealing substantial leakage driven by memorization and data outliers. The study also examines how attack design choices, data characteristics, and model utility influence leakage, and it tests defenses (including DP-based and selective protections) that mitigate the attacks only partially and do not offer a universal solution with acceptable utility. Overall, the paper underscores the need for targeted, privacy-aware defenses in POI recommendation and provides a publicly available framework for auditing privacy in spatio-temporal ML systems.

Abstract

As location-based services (LBS) have grown in popularity, more human mobility data has been collected. The collected data can be used to build machine learning (ML) models for LBS to enhance their performance and improve the overall user experience. However, this convenience comes with the risk of privacy leakage, since this type of data might contain sensitive information related to user identities, such as home/work locations. Prior work focuses on protecting mobility data privacy during transmission or prior to release and does not evaluate the privacy risk of mobility data-based ML models. To better understand and quantify the privacy leakage in mobility data-based ML models, we design a privacy attack suite containing data extraction and membership inference attacks tailored for point-of-interest (POI) recommendation models, one of the most widely used mobility data-based ML models. The attacks in our suite assume different adversary knowledge and aim to extract different types of sensitive information from mobility data, providing a holistic privacy risk assessment for POI recommendation models. Our experimental evaluation using two real-world mobility datasets demonstrates that current POI recommendation models are vulnerable to our attacks. We also present unique findings to understand what types of mobility data are more susceptible to privacy attacks. Finally, we evaluate defenses against these attacks and highlight future directions and challenges. Our attack suite is released at https://github.com/KunlinChoi/POIPrivacy.

Paper Structure (38 sections, 6 equations, 22 figures, 4 tables, 4 algorithms)

Figures (22)

  • Figure 1: Our attack suite highlights the privacy concerns in POI recommendation models. In particular, we demonstrate that an adversary can extract or infer membership information of locations or trajectories in the training dataset.
  • Figure 2: Attack performance of data extraction attacks (LocExtract and TrajExtract) on three victim models and two mobility datasets.
  • Figure 3: Attack performance of membership inference attacks (LocMIA and TrajMIA) on three victim models and two POI recommendation datasets. The diagonal line indicates the random guess baseline.
  • Figure 4: How location-level aggregate statistics are related to LocMIA. Locations that are visited by fewer distinct users or that have fewer surrounding check-ins are more vulnerable to LocMIA.
  • Figure 5: How user-level aggregate statistics are related to TrajMIA. x-axis: percentile, which groups users/locations/trajectories according to their feature values. y-axis: $\Lambda$, the (averaged) likelihood ratio of training trajectories/locations being members versus non-members in the hypothesis test for each group; a higher value indicates greater vulnerability. Users with fewer total check-ins, fewer unique POIs, and fewer or shorter trajectories are more vulnerable to TrajMIA. (4sq)
  • ...and 17 more figures
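
The Figure 5 caption refers to a likelihood ratio $\Lambda$ obtained from a hypothesis test. As a rough illustration of how such a ratio can be computed, the sketch below fits one Gaussian to scores observed when a sample was in the training data and another to scores observed when it was not, then compares the two likelihoods for an observed score. This is a generic likelihood-ratio membership test under stated assumptions, not necessarily the paper's exact procedure: the function names, the Gaussian modelling choice, and the example scores are all hypothetical.

```python
import math
from statistics import mean, stdev


def log_gaussian_pdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_likelihood_ratio(observed, in_scores, out_scores, min_sigma=1e-6):
    """Return log(Lambda) for one sample (hypothetical helper).

    observed   -- score of the sample under the victim model
    in_scores  -- scores from reference models trained WITH the sample
    out_scores -- scores from reference models trained WITHOUT the sample
    A positive value favours "member", a negative value "non-member".
    """
    mu_in, mu_out = mean(in_scores), mean(out_scores)
    # Clamp the standard deviations to avoid division by zero.
    sd_in = max(stdev(in_scores), min_sigma)
    sd_out = max(stdev(out_scores), min_sigma)
    return (log_gaussian_pdf(observed, mu_in, sd_in)
            - log_gaussian_pdf(observed, mu_out, sd_out))


# Made-up scores for illustration only (not results from the paper).
in_scores = [0.90, 0.80, 0.85, 0.95]
out_scores = [0.30, 0.40, 0.35, 0.25]

member_like = log_likelihood_ratio(0.88, in_scores, out_scores)
non_member_like = log_likelihood_ratio(0.30, in_scores, out_scores)
print(member_like > 0, non_member_like < 0)
```

Working in log space keeps the computation stable when one of the two densities is very small; $\Lambda$ itself is the exponential of the returned value.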

Theorems & Definitions (2)

  • Definition 1: $(\epsilon,\delta)$-DP
  • Definition 2: geo-indistinguishability
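
The two definitions listed above are named here without their statements. For reference, the standard textbook formulations are given below; the paper's own notation may differ. The symbols $\mathcal{M}$, $D$, $D'$, $S$, $K$, $x$, $x'$, and $d$ are assumptions introduced for this sketch.

```latex
% (epsilon, delta)-differential privacy: for all adjacent datasets D, D'
% (differing in one record) and all measurable output sets S,
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\epsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta .

% epsilon-geo-indistinguishability: for all locations x, x' and all
% measurable output sets S, with d(x, x') the distance between them,
\Pr[K(x) \in S] \;\le\; e^{\epsilon\, d(x, x')}\,\Pr[K(x') \in S] .
```

In the first definition the guarantee is uniform over adjacent datasets, whereas in the second it degrades smoothly with the distance between the two locations, so nearby locations are harder to tell apart than distant ones.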