The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)

Shenglai Zeng, Jiankun Zhang, Pengfei He, Yue Xing, Yiding Liu, Han Xu, Jie Ren, Shuaiqiang Wang, Dawei Yin, Yi Chang, Jiliang Tang

TL;DR

The paper investigates privacy risks in retrieval-augmented generation (RAG) systems. It shows that adversaries can extract private data from the external retrieval database using composite prompts, while retrieval augmentation can reduce leakage of memorized LLM training data. It introduces targeted and untargeted attacks on both retrieval data and training data, evaluates them on the Enron and HealthcareMagic datasets, and analyzes mitigation strategies including re-ranking, summarization, and retrieval distance thresholds. Key findings are that retrieval data leaks substantially under certain prompts, that re-ranking offers limited mitigation, and that adding retrieval data can significantly lower the likelihood of revealing training data. The work offers practical insights for securing RAG deployments and highlights the trade-off between retrieval-data exposure and model memorization risk.
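As a rough illustration of the distance-threshold mitigation mentioned in the summary, the sketch below drops retrieved documents whose embedding lies too far from the query. It is a minimal toy under stated assumptions, not the authors' implementation: the function names (`euclidean`, `retrieve_with_threshold`), the two-dimensional embeddings, and the threshold value are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_with_threshold(query_vec, docs, k=2, max_distance=1.0):
    """Return up to k document texts whose distance to the query is <= max_distance.

    docs is a list of (text, embedding) pairs. Both the data layout and the
    threshold are illustrative assumptions.
    """
    # Rank all documents by distance to the query, nearest first.
    scored = sorted((euclidean(query_vec, emb), text) for text, emb in docs)
    # Keep the top-k candidates, then discard any that exceed the threshold.
    return [text for dist, text in scored[:k] if dist <= max_distance]


# Toy corpus with hand-written 2-D "embeddings" (hypothetical data).
docs = [
    ("doc about billing", [0.0, 1.0]),
    ("doc about scheduling", [1.0, 0.0]),
    ("doc about travel", [5.0, 5.0]),
]

near_result = retrieve_with_threshold([0.1, 0.9], docs, k=2, max_distance=0.5)
far_result = retrieve_with_threshold([10.0, 10.0], docs, k=2, max_distance=0.5)
```

A query far from every stored document retrieves nothing, so no private text reaches the prompt. A tighter threshold also discards documents that would have helped legitimate queries, which is the privacy/performance tension the summary refers to.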

Abstract

Retrieval-augmented generation (RAG) is a powerful technique for augmenting language models with proprietary and private data, where data privacy is a pivotal concern. Whereas extensive research has demonstrated the privacy risks of large language models (LLMs), RAG could reshape the inherent behaviors of LLM generation, posing new privacy issues that are currently under-explored. In this work, we conduct extensive empirical studies with novel attack methods, which demonstrate that RAG systems are vulnerable to leaking the private retrieval database. Despite this new risk to the retrieval data, we further reveal that RAG can mitigate the leakage of the LLMs' training data. Overall, we provide new insights for privacy protection of retrieval-augmented LLMs, which benefit builders of both LLMs and RAG systems. Our code is available at https://github.com/phycholosogy/RAG-privacy.

Paper Structure

This paper contains 48 sections, 4 equations, 6 figures, and 23 tables.

Figures (6)

  • Figure 1: The RAG system and potential risks.
  • Figure 2: Ablation study on the command part. (R) means Repeat Contexts and (RG) means ROUGE Contexts.
  • Figure 3: Ablation study on number of retrieved docs per query k.
  • Figure 4: Potential post-processing mitigation strategies. The impact of reranking on (a) targeted attacks and (b) untargeted attacks, and the impact of summarization on (c) untargeted attacks and (d) targeted attacks.
  • Figure 5: The impact of the retrieval threshold on performance and privacy leakage.
  • ...and 1 more figure