Private-RAG: Answering Multiple Queries with LLMs while Keeping Your Data Private

Ruihan Wu, Erchi Wang, Zhiyuan Zhang, Yu-Xiang Wang

TL;DR

This work tackles the challenge of answering multiple queries with retrieval-augmented generation (RAG) while preserving the privacy of a sensitive external corpus. It introduces MuRAG and MuRAG-Ada, two differentially private multi-query RAG algorithms that use per-document Rényi DP filters to bound privacy loss by how often each document is retrieved, avoiding naive composition over queries. MuRAG employs a fixed relevance threshold, whereas MuRAG-Ada privately releases query-specific thresholds to better handle correlated queries; both can incorporate any single-query private RAG method and come with proven DP guarantees. Empirical results across standard QA tasks, a correlated multi-hop dataset, and a privacy-sensitive ChatDoctor setting show that the methods achieve practical utility for hundreds of queries within $\varepsilon \approx 10$, and that MuRAG-Ada performs especially well on correlated queries while defending against multi-query membership inference attacks. The work offers a principled, scalable path toward private RAG deployment in real-world, multi-query environments.

Abstract

Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving documents from an external corpus at inference time. When this corpus contains sensitive information, however, unprotected RAG systems are at risk of leaking private information. Prior work has introduced differential privacy (DP) guarantees for RAG, but only in single-query settings, which fall short of realistic usage. In this paper, we study the more practical multi-query setting and propose two DP-RAG algorithms. The first, MuRAG, leverages an individual privacy filter so that the accumulated privacy loss depends only on how frequently each document is retrieved rather than on the total number of queries. The second, MuRAG-Ada, further improves utility by privately releasing query-specific thresholds, enabling more precise selection of relevant documents. Our experiments across multiple LLMs and datasets demonstrate that the proposed methods scale to hundreds of queries within a practical DP budget ($\varepsilon\approx10$), while preserving meaningful utility.
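
The per-document accounting idea described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it uses plain additive budgets instead of Rényi DP accounting, it omits the private answer-generation step entirely, and the names `PerDocumentFilter`, `answer_queries`, `tau`, `budget`, and `cost` are hypothetical. It only shows the bookkeeping: a document is charged when its relevance score clears a fixed threshold, and it is dropped once its own budget is exhausted.

```python
from dataclasses import dataclass, field


@dataclass
class PerDocumentFilter:
    """Toy individual privacy filter: every document has its own budget."""
    budget: float                               # assumed identical for all documents
    spent: dict = field(default_factory=dict)   # doc_id -> budget consumed so far

    def can_pay(self, doc_id, cost):
        # A document stays usable only while it can afford one more charge.
        return self.budget - self.spent.get(doc_id, 0.0) >= cost

    def charge(self, doc_id, cost):
        self.spent[doc_id] = self.spent.get(doc_id, 0.0) + cost


def answer_queries(scores_per_query, tau, budget, cost):
    """For each query, select documents with score >= tau that still have budget.

    scores_per_query: list of {doc_id: relevance_score} dicts, one per query.
    Returns the documents used per query and the filter state.
    """
    filt = PerDocumentFilter(budget)
    used = []
    for scores in scores_per_query:
        active = [d for d, s in scores.items()
                  if s >= tau and filt.can_pay(d, cost)]
        for d in active:
            filt.charge(d, cost)   # only documents that are actually used pay
        used.append(active)        # a private single-query RAG step would run here
    return used, filt


if __name__ == "__main__":
    queries = [
        {"a": 0.9, "b": 0.2},
        {"a": 0.8, "c": 0.7},
        {"a": 0.95, "b": 0.6},
    ]
    used, filt = answer_queries(queries, tau=0.5, budget=1.0, cost=0.5)
    print(used)
    print(filt.spent)
```

In this example three queries are answered, yet no document spends more than its budget of 1.0, whereas charging every document for every query would cost 1.5. Selecting by a per-document threshold rather than by rank is deliberate in the sketch: whether a document is charged then depends only on that document's own score and remaining budget.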

Paper Structure

This paper contains 27 sections, 7 theorems, 4 equations, 5 figures, 2 tables, and 9 algorithms.

Key Result

Theorem 1

$\textsc{MuRAG}$ satisfies $\varepsilon$-differential privacy provided that the initial privacy budget assigned to each document $z \in D$ is at most $\varepsilon$.

Figures (5)

  • Figure 1: Histogram of document reuse across questions. Each bar shows how many questions a document appears in among the top-$K$ retrieved results ($K=50$). The x-axis indicates the number of questions per document, and the y-axis shows the count of such documents.
  • Figure 2: Privacy-Utility tradeoffs of our two proposed methods (MuRAG and MuRAG-Ada) compared to baselines across three pretrained LLMs and two categories of question sets.
  • Figure 3: Left: Privacy-utility tradeoffs of our two methods and baselines. Right: TPR-FPR curves of the MIA (membership inference attack with multiple queries). Both experiments are conducted with Mistral-7B and ChatDoctor datasets.
  • Figure 4: Comparison of $M=1$ and $M=5$ in the individual privacy accounting framework. The left plot shows the retrieval precision of the two methods with $M=1$ and $M=5$. The right three plots show the trade-off between QA performance and the total DP budget $\varepsilon_{\rm total}$.
  • Figure 5: Absolute error of releasing $\tau_t$ in MuRAG-Ada.
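
The document-reuse statistic plotted in Figure 1 can be computed along the following lines. This is a minimal sketch under assumptions: the function name `reuse_histogram` and the input format (one list of retrieved document IDs per question) are hypothetical, and the retrieval step that produces the top-$K$ lists is not shown.

```python
from collections import Counter


def reuse_histogram(top_k_per_question):
    """Map 'number of questions a document appears in' -> 'number of such documents'."""
    questions_per_doc = Counter()
    for docs in top_k_per_question:
        # Count each document at most once per question.
        questions_per_doc.update(set(docs))
    return Counter(questions_per_doc.values())


if __name__ == "__main__":
    retrieved = [["a", "b"], ["a", "c"], ["a", "b"]]
    hist = reuse_histogram(retrieved)
    for n_questions in sorted(hist):
        print(n_questions, hist[n_questions])
```

A histogram concentrated at small reuse counts would mean most documents are retrieved for only a few questions, which is the situation where per-document accounting spends far less than charging every document for every query.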

Theorems & Definitions (17)

  • Definition 1: Differential Privacy [dmns06]
  • Definition 2: Rényi Differential Privacy [mironov2017renyi]
  • Definition 3: Individual Rényi Differential Privacy
  • Definition 4: (Individual) Rényi Differential Privacy Filters [feldman2021individual]
  • Theorem 1: Privacy Guarantee of Algorithm \ref{alg:dp_fix_tau}
  • Theorem 2: Privacy Guarantee of Algorithm \ref{alg:batch_dp_rag-v2}
  • Lemma 1: Privacy Guarantee of Algorithm \ref{alg:dp-rag-v2}
  • proof
  • Lemma 2: Privacy Guarantee of Algorithm \ref{alg:naive_batch_dp_rag}
  • proof
  • ...and 7 more