FedMosaic: Federated Retrieval-Augmented Generation via Parametric Adapters

Zhilin Liang, Yuxiang Wang, Zimu Zhou, Hainan Zhang, Boyi Liu, Yongxin Tong

TL;DR

This paper addresses the challenge of performing knowledge-grounded generation when raw documents cannot be shared due to privacy constraints. It introduces FedMosaic, a federated RAG framework built on parametric adapters that encodes local documents into multi-document adapters with document-specific masks and employs selective, conflict-aware aggregation across silos. The approach delivers an average 10.9% accuracy improvement over state-of-the-art methods while reducing silo storage by 78.8%–86.3% and communication by 91.4%, all without transmitting raw documents. The results demonstrate FedMosaic’s effectiveness, scalability to larger models, and strong privacy guarantees, making it a practical solution for distributed, privacy-sensitive knowledge bases.

Abstract

Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by grounding generation in external knowledge to improve factuality and reduce hallucinations. Yet most deployments assume a centralized corpus, which is infeasible in privacy-aware domains where knowledge remains siloed. This motivates federated RAG (FedRAG), where a central LLM server collaborates with distributed silos without sharing raw documents. In-context RAG violates this requirement by transmitting verbatim documents, whereas parametric RAG encodes documents into lightweight adapters that merge with a frozen LLM at inference, avoiding raw-text exchange. We adopt the parametric approach but face two unique challenges induced by FedRAG: high storage and communication costs from per-document adapters, and destructive aggregation caused by indiscriminately merging multiple adapters. We present FedMosaic, the first federated RAG framework built on parametric adapters. FedMosaic clusters semantically related documents into multi-document adapters with document-specific masks to reduce overhead while preserving specificity, and performs selective adapter aggregation to combine only relevance-aligned, nonconflicting adapters. Experiments show that FedMosaic achieves an average 10.9% higher accuracy than state-of-the-art methods in four categories, while lowering storage costs by 78.8% to 86.3% and communication costs by 91.4%, and never sharing raw documents.
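
To make the idea of selective aggregation concrete, the following is a minimal sketch of one way a server could keep only relevance-aligned, nonconflicting adapters. It is not the paper's algorithm: the greedy strategy, the `Candidate` type, the `select_adapters` function, the relevance and conflict scores, both thresholds, and the adapter identifiers are all hypothetical and chosen only for illustration. The paper lists a Weighted Subgraph Selection Problem among its definitions; this sketch does not attempt to reproduce it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical adapter candidate reported by a silo."""
    adapter_id: str
    relevance: float  # assumed query-relevance score, higher is better


def select_adapters(candidates, conflict, min_relevance=0.5, max_conflict=0.3):
    """Greedy sketch: keep relevant adapters that do not conflict with kept ones.

    `conflict` maps frozenset({id_a, id_b}) to an assumed pairwise conflict
    score; pairs missing from the mapping are treated as conflict-free.
    """
    selected = []
    # Visit candidates from most to least relevant.
    for cand in sorted(candidates, key=lambda c: c.relevance, reverse=True):
        if cand.relevance < min_relevance:
            break  # every remaining candidate is even less relevant
        compatible = all(
            conflict.get(frozenset((cand.adapter_id, kept.adapter_id)), 0.0)
            <= max_conflict
            for kept in selected
        )
        if compatible:
            selected.append(cand)
    return [c.adapter_id for c in selected]


# Illustrative inputs only; none of these values come from the paper.
candidates = [
    Candidate("silo_a/cluster_3", 0.91),
    Candidate("silo_b/cluster_1", 0.84),
    Candidate("silo_c/cluster_7", 0.62),
    Candidate("silo_b/cluster_5", 0.20),
]
conflict = {frozenset(("silo_a/cluster_3", "silo_b/cluster_1")): 0.8}

print(select_adapters(candidates, conflict))
# → ['silo_a/cluster_3', 'silo_c/cluster_7']
```

In this toy run the second-ranked adapter is skipped because it conflicts with the top-ranked one, and the last adapter is dropped for low relevance. Only the surviving adapters would then be merged with the frozen LLM, which is the behavior the abstract contrasts with indiscriminate merging.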

Paper Structure (28 sections, 14 equations, 6 figures, 5 tables, 2 algorithms)

Figures (6)

  • Figure 1: Federated RAG with locality constraint.
  • Figure 2: Accuracy curves of parametric RAG for (a) grouped documents under a LoRA adapter across different training epochs, and (b) aggregation of multiple LoRA adapters.
  • Figure 3: FedMosaic architecture and workflow.
  • Figure 4: Storage and communication overhead.
  • Figure 5: Impact of document mask.
  • ...and 1 more figure

Theorems & Definitions (1)

  • Definition 1: Weighted Subgraph Selection Problem