Making Alice Appear Like Bob: A Probabilistic Preference Obfuscation Method For Implicit Feedback Recommendation Models
Gustavo Escobedo, Marta Moscati, Peter Muellner, Simone Kopeinik, Dominik Kowald, Elisabeth Lex, Markus Schedl
TL;DR
The paper tackles privacy leakage in implicit-feedback recommender systems, where user interactions correlate with protected attributes. It introduces Stereotypicality-Based Obfuscation (SBO), a probabilistic method that lowers the stereotypicality of user profiles by selectively obfuscating the items most strongly associated with stereotypes. Item selection relies on item- and user-level metrics (IGI and I_Ster), the amount of obfuscation is controlled by a ratio $\rho$, and individual items are sampled through Bernoulli trials. SBO is evaluated with three recommender models (BPR-MF, LightGCN, MultVAE) on MovieLens-1M and Last.fm-2b-100k, where it improves privacy (lower attacker accuracy) with only modest drops in utility (NDCG@10) and often outperforms PerBlur, a state-of-the-art obfuscation method. The results indicate that focusing obfuscation on profile items selected via stereotypicality metrics yields favorable privacy-utility trade-offs. The authors see potential for extending the method to more protected attributes and to membership inference tasks in future work.
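The sampling idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `obfuscate_profile`, the `stereo_score` mapping (a hypothetical stand-in for an item-level stereotypicality score in [0, 1], not the paper's IGI or I_Ster definitions), and the choice to obfuscate by removing items are all assumptions made for illustration. The only elements taken from the summary are the ratio $\rho$ as an obfuscation budget and the use of Bernoulli trials.

```python
import math
import random


def obfuscate_profile(profile, stereo_score, rho, rng=None):
    """Remove at most floor(rho * len(profile)) items, favouring stereotypical ones.

    profile: list of item ids the user interacted with.
    stereo_score: dict mapping item id -> score in [0, 1]. This is a
        hypothetical stand-in, not the paper's stereotypicality metric.
    rho: obfuscation ratio in [0, 1], used here as a removal budget (assumption).
    rng: optional random.Random instance for reproducibility.
    """
    if rng is None:
        rng = random.Random()
    budget = math.floor(rho * len(profile))
    # Visit the most stereotypical items first (stable for equal scores).
    candidates = sorted(profile, key=lambda i: stereo_score.get(i, 0.0), reverse=True)
    removed = set()
    for item in candidates:
        if len(removed) >= budget:
            break
        # Bernoulli trial: the success probability is the item's score.
        if rng.random() < stereo_score.get(item, 0.0):
            removed.add(item)
    # Keep the original profile order for the surviving items.
    return [i for i in profile if i not in removed]
```

With scores of exactly 1.0 and 0.0 the trials are deterministic, so `obfuscate_profile([1, 2, 3, 4], {1: 1.0, 2: 0.0, 3: 1.0, 4: 0.0}, 0.5)` returns `[2, 4]`. With intermediate scores, the result depends on the random draws, and fewer items than the budget allows may be removed.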
Abstract
Users' interaction or preference data used in recommender systems carry the risk of unintentionally revealing users' private attributes (e.g., gender or race). This risk becomes particularly concerning when the training data contains user preferences that can be used to infer these attributes, especially if they align with common stereotypes. This major privacy issue allows malicious attackers or other third parties to infer users' protected attributes. Previous efforts to address this issue have added or removed parts of users' preferences prior to or during model training to improve privacy, which often leads to decreases in recommendation accuracy. In this work, we introduce SBO, a novel probabilistic obfuscation method for user preference data designed to improve the accuracy-privacy trade-off for such recommendation scenarios. We apply SBO to three state-of-the-art recommendation models (i.e., BPR, MultVAE, and LightGCN) and two popular datasets (i.e., MovieLens-1M and LFM-2B). Our experiments reveal that SBO outperforms comparable approaches with respect to the accuracy-privacy trade-off. Specifically, we can reduce the leakage of users' protected attributes while maintaining on-par recommendation accuracy.
