Stealthy Attack on Large Language Model based Recommendation

Jinghao Zhang, Yuting Liu, Qiang Liu, Shu Wu, Guibing Guo, Liang Wang

TL;DR

This work shows that large language model–based recommender systems, which rely heavily on textual item descriptions, are vulnerable to stealthy text-based attacks that boost exposure of target items during testing without retraining. The authors introduce model-agnostic text manipulations (word insertion and GPT-based rewriting) and analyze black-box text attacks across four victim models and three Amazon datasets, highlighting transferability across tasks and models. A simple rewriting defense is proposed, capable of mitigating character-level attacks while preserving overall recommendation performance. The findings establish a critical security gap in LLM-based RS and motivate development of robust defenses and attack-aware evaluation protocols for deployment in real-world systems.

Abstract

Recently, the powerful large language models (LLMs) have been instrumental in propelling the progress of recommender systems (RS). However, while these systems have flourished, their susceptibility to security threats has been largely overlooked. In this work, we reveal that the introduction of LLMs into recommendation models presents new security vulnerabilities due to their emphasis on the textual content of items. We demonstrate that attackers can significantly boost an item's exposure by merely altering its textual content during the testing phase, without requiring direct interference with the model's training process. Additionally, the attack is notably stealthy, as it does not affect the overall recommendation performance and the modifications to the text are subtle, making it difficult for users and platforms to detect. Our comprehensive experiments across four mainstream LLM-based recommendation models demonstrate the superior efficacy and stealthiness of our approach. Our work unveils a significant security gap in LLM-based recommendation systems and paves the way for future research on protecting these systems.

Paper Structure

This paper contains 38 sections, 1 equation, 4 figures, 22 tables, and 1 algorithm.

Figures (4)

  • Figure 1: The proposed text attack paradigm on LLM-based RS model. Malicious attackers modify the titles of target items to mislead RS models to rank them higher. The attack is highly stealthy since the modification is subtle and overall recommendation performance is almost unchanged.
  • Figure 2: Performance comparison of different attacks on RecFormer. The size of the scatter points represents the cosine semantic similarity with the original title, with larger points indicating better semantic preservation (best viewed in color).
  • Figure 3: Performance comparison of different attacks on RecFormer. The size of the scatter points represents the cosine semantic similarity with the original title, with larger points indicating better semantic preservation (best viewed in color).
  • Figure 4: Performance comparison of different attacks on various models. The size of the scatter points represents the cosine semantic similarity with the original title, with larger points indicating better semantic preservation (best viewed in color).
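
The figure captions above use cosine semantic similarity between an attacked title and its original as a measure of how well the meaning is preserved. The following is a minimal sketch of that measure, not the authors' implementation. It assumes a toy bag-of-words vector in place of the sentence encoder the paper would use (the encoder is not specified in this section), and the function names and example titles are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for an all-zero vector
    return dot / (norm_u * norm_v)


def bag_of_words(title, vocab):
    """Toy stand-in for a semantic encoder (assumption): word counts over vocab."""
    counts = Counter(title.lower().split())
    return [counts[word] for word in vocab]


# Hypothetical titles: the attacked title inserts a single extra word.
original_title = "wireless bluetooth headphones"
attacked_title = "wireless bluetooth headphones premium"

vocab = sorted(set(original_title.split()) | set(attacked_title.split()))
similarity = cosine_similarity(
    bag_of_words(original_title, vocab),
    bag_of_words(attacked_title, vocab),
)
print(f"similarity = {similarity:.3f}")
```

A value close to 1 corresponds to a larger scatter point in the figures, meaning the modified title stays close to the original. With a real sentence encoder, only `bag_of_words` would need to be replaced.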