Poison-RAG: Adversarial Data Poisoning Attacks on Retrieval-Augmented Generation in Recommender Systems
Fatemeh Nazary, Yashar Deldjoo, Tommaso di Noia
TL;DR
The paper addresses the vulnerability of retrieval-augmented generation (RAG) based recommender systems to adversarial data poisoning through item metadata manipulation. It proposes Poison-RAG, a black-box attack framework that selects adversarial tags to promote long-tail items and demote popular ones by optimizing exposure across users, with the objective framed as $\max_{\Gamma} \frac{1}{|\mathcal{U}|} \sum_{u \in \mathcal{U}} \left[ \sum_{i \in R'(u) \cap \mathcal{I}_L} \mathrm{Exposure}(i,u) - \sum_{j \in R'(u) \cap \mathcal{I}_P} \mathrm{Exposure}(j,u) \right]$. The tag scoring combines adversarial impact and semantic relevance as $A'(t,i) = A(t) \cdot s(t,i)$, where $A(t) = \log \left( \frac{P(t \mid c_{\mathrm{target}})}{P(t \mid c_{\mathrm{orig}}) + \epsilon} \right)$ and $s(t,i) = \frac{e_t^\top e_i}{\|e_t\| \, \|e_i\|}$, with embeddings from pre-trained models. Experiments on MovieLens with data enrichment show that local, item-specific tagging effectively shifts exposure toward long-tail items, while global tagging can inadvertently boost popular items. Data augmentation provides a partial defense, underscoring the need for robust metadata management in RAG-based recommendations.
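As an illustration of the tag score $A'(t,i) = A(t) \cdot s(t,i)$, the following is a minimal Python sketch, not the authors' implementation. The function names, the value of the smoothing constant `EPS`, and the toy probabilities and embeddings are all assumptions made for this example; in practice the probabilities would be estimated from tag frequencies in the target and original item classes, and the embeddings would come from a pre-trained model.

```python
import math
import numpy as np

EPS = 1e-8  # smoothing constant epsilon; this value is an assumption


def adversarial_impact(p_target: float, p_orig: float, eps: float = EPS) -> float:
    """A(t) = log(P(t | c_target) / (P(t | c_orig) + eps)).

    Requires p_target > 0, since log(0) is undefined.
    """
    return math.log(p_target / (p_orig + eps))


def cosine_similarity(e_t: np.ndarray, e_i: np.ndarray) -> float:
    """s(t, i) = e_t^T e_i / (||e_t|| ||e_i||)."""
    return float(e_t @ e_i / (np.linalg.norm(e_t) * np.linalg.norm(e_i)))


def tag_score(p_target: float, p_orig: float,
              e_t: np.ndarray, e_i: np.ndarray, eps: float = EPS) -> float:
    """A'(t, i) = A(t) * s(t, i)."""
    return adversarial_impact(p_target, p_orig, eps) * cosine_similarity(e_t, e_i)


# Toy usage with made-up numbers: a tag that is four times more likely in the
# target class, compared against an item embedding pointing in a similar direction.
tag_embedding = np.array([1.0, 0.0, 1.0])
item_embedding = np.array([1.0, 0.5, 1.0])
score = tag_score(0.4, 0.1, tag_embedding, item_embedding)
```

Under this scoring, a tag ranks highly only when it is both characteristic of the target class (large $A(t)$) and semantically close to the item (large $s(t,i)$); a tag with negative or near-zero similarity to the item contributes little or is penalized.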
Abstract
This study presents Poison-RAG, a framework for adversarial data poisoning attacks targeting retrieval-augmented generation (RAG)-based recommender systems. Poison-RAG manipulates item metadata, such as tags and descriptions, to influence recommendation outcomes. Using item metadata generated through a large language model (LLM) and embeddings derived via the OpenAI API, we explore the impact of adversarial poisoning attacks on the provider side, where attacks are designed to promote long-tail items and demote popular ones. Two attack strategies are proposed: local modifications, which personalize tags for each item using BERT embeddings, and global modifications, which apply uniform tags across the dataset. Experiments conducted on the MovieLens dataset in a black-box setting reveal that local strategies improve manipulation effectiveness by up to 50%, while global strategies risk boosting already popular items. Results indicate that popular items are more susceptible to attacks, whereas long-tail items are harder to manipulate. Approximately 70% of items lack tags, presenting a cold-start challenge; data augmentation and synthesis are proposed as potential defense mechanisms to enhance the resilience of RAG-based systems. The findings emphasize the need for robust metadata management to safeguard recommendation frameworks. Code and data are available at https://github.com/atenanaz/Poison-RAG.
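The provider-side goal of promoting long-tail items and demoting popular ones corresponds to the exposure-gap objective stated in the TL;DR. The sketch below shows one way to evaluate that quantity for a fixed set of recommendation lists. It is a hedged illustration only: the source does not define $\mathrm{Exposure}(i,u)$ here, so the position-discounted form $1/\log_2(\mathrm{rank}+1)$ is an assumption, as are the function names and the toy data.

```python
import math


def exposure(rank: int) -> float:
    """Assumed exposure model: 1 / log2(rank + 1) for a 1-based rank.

    The source does not specify Exposure(i, u); this discount is a stand-in.
    """
    return 1.0 / math.log2(rank + 1)


def exposure_gap(rec_lists: dict, long_tail: set, popular: set) -> float:
    """Average over users of (long-tail exposure - popular exposure).

    rec_lists maps each user to their ranked recommendation list R'(u).
    """
    total = 0.0
    for recs in rec_lists.values():
        for rank, item in enumerate(recs, start=1):
            if item in long_tail:
                total += exposure(rank)   # item in R'(u) and I_L
            elif item in popular:
                total -= exposure(rank)   # item in R'(u) and I_P
    return total / len(rec_lists)


# Toy usage with hypothetical item IDs: compare the gap before and after
# an attack moves the long-tail item "niche" above the popular item "hit".
long_tail_items = {"niche"}
popular_items = {"hit"}
before = exposure_gap({"u1": ["hit", "niche"]}, long_tail_items, popular_items)
after = exposure_gap({"u1": ["niche", "hit"]}, long_tail_items, popular_items)
```

A successful attack under this measure increases the gap, so `after` exceeds `before` in the toy case; an attacker would search over tag sets $\Gamma$ to maximize the value returned by `exposure_gap`.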
