Inferring Communities of Interest in Collaborative Learning-based Recommender Systems

Yacine Belal, Sonia Ben Mokhtar, Mohamed Maouche, Anthony Simonet-Boulogne

TL;DR

The paper investigates privacy risks in collaborative-learning-based recommender systems by introducing the Community Inference Attack (CIA), a low-cost, comparison-based attack that infers communities of users who share a set of target items. CIA operates in both Federated Recommender Systems (FedRecs) and Gossip Learning-based Recommender Systems (GossipRecs), reaching up to 10 times the accuracy of random guessing in FL and around 3 times in GL, without training surrogate models. The paper evaluates two defenses, Share less and Differentially Private SGD (DP-SGD), and finds that Share less generally improves the privacy-utility trade-off in FedRecs but can be counterproductive in GossipRecs due to model aging effects; DP-SGD offers formal privacy guarantees at the expense of utility. The results highlight substantial privacy leakage in distributed recommender systems and provide guidance for defense design, including the relative value of Share less over DP-SGD and the potential need for novel protections in decentralized settings.

Abstract

Collaborative-learning-based recommender systems, such as those employing Federated Learning (FL) and Gossip Learning (GL), allow users to train models while keeping their history of liked items on their devices. While these methods were seen as promising for enhancing privacy, recent research has shown that collaborative learning can be vulnerable to various privacy attacks. In this paper, we propose a novel attack called Community Inference Attack (CIA), which enables an adversary to identify community members based on a set of target items. What sets CIA apart is its efficiency: it operates at low computational cost by eliminating the need for training surrogate models. Instead, it uses a comparison-based approach, inferring sensitive information by comparing users' models rather than targeting any specific individual model. To evaluate the effectiveness of CIA, we conduct experiments on three real-world recommendation datasets using two recommendation models under both Federated and Gossip-like settings. The results demonstrate that CIA can be up to 10 times more accurate than random guessing. Additionally, we evaluate two mitigation strategies: Differentially Private Stochastic Gradient Descent (DP-SGD) and a Share less policy, which involves sharing fewer, less sensitive model parameters. Our findings suggest that the Share less strategy offers a better privacy-utility trade-off, especially in GL.
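
The comparison-based idea can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes, hypothetically, that the adversary can score the target items with each observed user model (here a toy dot-product model with one embedding per user and shared item embeddings) and guesses that the users whose models score the target set highest, relative to the other users, form the community. The names `infer_community`, `attack_accuracy`, `user_vecs` and `item_vecs`, and the accuracy definition, are assumptions made for illustration.

```python
import numpy as np

def infer_community(user_vecs, item_vecs, target_items, k):
    """Return the ids of the k users whose models score the target items highest.

    Hypothetical sketch: user_vecs is (n_users, d), item_vecs is (n_items, d).
    """
    # Predicted relevance of every target item for every user: (n_users, n_targets)
    scores = user_vecs @ item_vecs[target_items].T
    # One number per user: mean relevance over the target set
    mean_scores = scores.mean(axis=1)
    # Compare users against each other and keep the k highest-scoring ones
    order = np.argsort(-mean_scores, kind="stable")
    return order[:k].tolist()

def attack_accuracy(predicted, true_members):
    """Fraction of predicted users who truly belong to the community (assumed metric)."""
    return len(set(predicted) & set(true_members)) / len(predicted)

# Toy data: items 0 and 1 are the target set; users 0 and 1 lean towards them.
item_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
user_vecs = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9], [0.5, 0.5]])

predicted = infer_community(user_vecs, item_vecs, target_items=[0, 1], k=2)
print(predicted, attack_accuracy(predicted, true_members=[0, 1]))
```

No surrogate model is trained in this sketch: the only work is one matrix product and a sort over users, which is the kind of cost profile the abstract attributes to a comparison-based attack.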

Paper Structure

This paper contains 43 sections, 6 equations, 5 figures, 9 tables, 2 algorithms.

Figures (5)

  • Figure 1: CIA run targeting "health vulnerable" users in the Foursquare dataset.
  • Figure 2: Community Inference Attack (CIA): a comparison between FL and GL settings.
  • Figure 3: Attack Accuracy and Hit Ratio@20 trade-off summary for the full models and Share less strategies on GMF.
  • Figure 4: Attack Accuracy and F1-Score trade-off summary for the full models and Share less strategies on PRME.
  • Figure 5: Average utility and empirical privacy trade-off on Movielens under DP-SGD with different values of privacy budget $\epsilon$, $\delta = 10^{-6}$ and clipping = 2. Utility is measured in Hit Ratio.
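
The DP-SGD parameters named in the Figure 5 caption map onto the standard DP-SGD update: clip each per-example gradient to an L2 norm bound, sum the clipped gradients, add Gaussian noise scaled to that bound, and average. The sketch below assumes that "clipping = 2" in the caption refers to this norm bound. The function name `dp_sgd_gradient` and the `noise_multiplier` value are hypothetical, and calibrating the noise to a given $\epsilon$ and $\delta$ (normally done with a privacy accountant) is not shown.

```python
import numpy as np

def dp_sgd_gradient(per_example_grads, clip_norm=2.0, noise_multiplier=1.0, rng=None):
    """Return a clipped, noised, averaged gradient for one mini-batch.

    per_example_grads: array-like of shape (batch_size, dim).
    clip_norm: L2 bound applied to each example's gradient (assumed to be the
               'clipping = 2' of the caption).
    noise_multiplier: hypothetical value; the noise std is noise_multiplier * clip_norm.
    """
    rng = np.random.default_rng() if rng is None else rng
    grads = np.asarray(per_example_grads, dtype=float)
    # L2 norm of each example's gradient, kept as a column for broadcasting
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Scale factor min(1, C / ||g||): gradients within the bound are left unchanged
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Gaussian noise added once to the sum of clipped gradients
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / grads.shape[0]

# With the noise switched off, only the clipping is visible:
# the first gradient (norm 5) is scaled down to norm 2, the second (norm 1) is kept.
g = dp_sgd_gradient([[3.0, 4.0], [0.6, 0.8]], clip_norm=2.0, noise_multiplier=0.0)
print(g)
```

A larger noise multiplier corresponds to a smaller privacy budget and a noisier update, which is the utility cost the TL;DR attributes to DP-SGD.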