Evaluating Impact of User-Cluster Targeted Attacks in Matrix Factorisation Recommenders

Sulthana Shams, Douglas Leith

TL;DR

This paper studies user-cluster targeted data poisoning in matrix factorisation (MF) based recommender systems (RS), in which an adversary injects fake users to promote a chosen item within a target cluster. It compares two update regimes, derives analytical expressions for how the latent matrices $U$ and $V$ evolve under attack, and shows that updates to the item vector $V_{j^*}$ drive cross-cluster leakage more than updates to $U$. The study finds that items with few ratings in the target cluster are more susceptible and that leakage to non-target clusters depends on feature correlations. Results are validated on MovieLens and Goodreads using synthetic data, demonstrating practical implications for RS robustness. The authors discuss defence strategies, such as augmenting true ratings, decoupling update frequencies, and monitoring shifts in item vectors, to mitigate targeted data poisoning risks.

Abstract

In practice, users of a Recommender System (RS) fall into a few clusters based on their preferences. In this work, we conduct a systematic study on user-cluster targeted data poisoning attacks on Matrix Factorisation (MF) based RS, where an adversary injects fake users with falsely crafted user-item feedback to promote an item to a specific user cluster. We analyse how user and item feature matrices change after data poisoning attacks and identify the factors that influence the effectiveness of the attack on these feature matrices. We demonstrate that the adversary can easily target specific user clusters with minimal effort and that some items are more susceptible to attacks than others. Our theoretical analysis has been validated by the experimental results obtained from two real-world datasets. Our observations from the study could serve as a motivating point to design a more robust RS.
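As a rough illustration of why items with few true ratings in the target cluster might be easier to promote, the sketch below refits a single item vector by regularised least squares before and after fake users are added. It is not the authors' method or experimental setup. Everything in it is an assumption made for illustration: the user vectors are held fixed, the two clusters have orthogonal two-dimensional feature vectors, the fake users share the target cluster's vector, and the rating values, cluster sizes, and `lam` are arbitrary. The helper names `item_vector` and `rating_shift` are hypothetical.

```python
import numpy as np


def item_vector(U, r, lam=0.1):
    """Regularised least-squares item vector for fixed user vectors U (one row per user)."""
    d = U.shape[1]
    return np.linalg.solve(U.T @ U + lam * np.eye(d), U.T @ r)


def rating_shift(n_true, m_fake, lam=0.1):
    """Change in the target cluster's predicted rating after adding m_fake fake users."""
    # Assumed toy setup: two clusters with orthogonal feature vectors.
    u_target = np.array([1.0, 0.0])
    u_other = np.array([0.0, 1.0])
    true_rating, other_rating, fake_rating = 2.0, 3.0, 5.0
    n_other = 20

    # True users: n_true in the target cluster, n_other in the other cluster.
    U_true = np.vstack([np.tile(u_target, (n_true, 1)),
                        np.tile(u_other, (n_other, 1))])
    r_true = np.concatenate([np.full(n_true, true_rating),
                             np.full(n_other, other_rating)])

    # Fake users mimic the target cluster and give the item the top rating.
    U_fake = np.tile(u_target, (m_fake, 1))
    r_fake = np.full(m_fake, fake_rating)

    v_before = item_vector(U_true, r_true, lam)
    v_after = item_vector(np.vstack([U_true, U_fake]),
                          np.concatenate([r_true, r_fake]), lam)
    return float(u_target @ (v_after - v_before))


# Same number of fake users, different numbers of true ratings in the target cluster.
shift_few = rating_shift(n_true=10, m_fake=5)
shift_many = rating_shift(n_true=100, m_fake=5)
print(f"shift with 10 true ratings:  {shift_few:.3f}")
print(f"shift with 100 true ratings: {shift_many:.3f}")
```

In this toy setting the same five fake users move the predicted rating far more when the target cluster holds 10 true ratings than when it holds 100, which is consistent with the dependence on the ratio of true to fake users described in the figure captions below.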

Paper Structure

This paper contains 48 sections, 31 equations, 17 figures, 5 tables.

Figures (17)

  • Figure 1: Plot showing the percentage of fake users entering each cluster when targeting each cluster $g$, for the MovieLens (ML) and Goodreads (GR) datasets respectively, using distinguisher filler items
  • Figure 2: Plot comparing the change in predicted rating in target cluster against the increasing ratio of true users $(n)$ to fake users $(m)$ in the target cluster ($\frac{n}{m}$)
  • Figure 3: Plot comparing the change in the predicted rating of an item in the target cluster against correlation values between the target item and the other items
  • Figure 4: Plot comparing the change in the predicted rating of the target item across clusters when cluster 2 is targeted
  • Figure 5: Plot comparing the relative change in the predicted rating of target item against increasing ratio $\frac{N_t}{N_f}$ in the target cluster
  • ...and 12 more figures