Separate the Wheat from the Chaff: Winnowing Down Divergent Views in Retrieval Augmented Generation

Song Wang; Zihan Chen; Peng Wang; Zhepei Wei; Zhen Tan; Yu Meng; Cong Shen; Jundong Li

TL;DR

WinnowRAG tackles the noise problem in retrieval-augmented generation with a two-stage, training-free framework. Stage I performs query-aware clustering of the retrieved documents and assigns an LLM agent to each cluster. Stage II runs a critic-guided, multi-agent winnowing process that uses embedding-space Ellipse and Hyperbola merging to retain useful documents while discarding noise, all without task-specific fine-tuning. Empirical results across knowledge-intensive benchmarks show that WinnowRAG consistently outperforms training-free baselines, and ablations confirm the critical roles of clustering, merging, and iterative critique. The approach is model-agnostic, scalable, and applicable to diverse domains, offering a practical path to more reliable RAG systems without costly fine-tuning.

Abstract

Retrieval-augmented generation (RAG) enhances large language models (LLMs) by integrating external knowledge sources to address their limitations in accessing up-to-date or specialized information. A natural strategy to increase the likelihood of retrieving relevant information is to expand the number of retrieved documents. However, involving more documents could introduce significant noise, as many documents may be irrelevant or misleading, thereby reducing the overall accuracy of the generated responses. To overcome the challenge associated with handling a larger number of documents, we propose WinnowRAG, a novel RAG framework designed to systematically filter out noisy documents while preserving valuable content -- a process we refer to as winnowing. WinnowRAG operates in two stages: In Stage I, we perform query-aware clustering to group similar documents and form distinct topic clusters. Each cluster is assigned to an LLM agent for generating a unique answer. In Stage II, we perform winnowing, wherein a critic LLM evaluates the outputs of multiple agents and iteratively separates useful documents from noisy ones. To retain useful documents when discarding agents, we propose two strategic merging techniques to ensure that only relevant knowledge is used for generating the final response. Crucially, WinnowRAG is model-agnostic and does not require any model fine-tuning, making it easily adaptable to various tasks. Extensive experiments on various realistic datasets demonstrate the effectiveness of WinnowRAG over state-of-the-art baselines.
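
The two-stage flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the k-means clustering, the `agent` and `critic` callables, the toy embeddings, and the radius-based `merge_dropped` rule are all assumptions made for illustration. In particular, `merge_dropped` is a simplified stand-in and does not implement the paper's Ellipse or Hyperbola merging.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Agent = Callable[[str, List[str]], str]          # hypothetical: (query, docs) -> answer
Critic = Callable[[str, List[str]], List[int]]   # hypothetical: (query, answers) -> kept agent indices


def _dist(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _centroid(vectors: List[Vector]) -> List[float]:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cluster_documents(embeddings: List[Vector], k: int, iters: int = 10) -> List[List[int]]:
    """Stage I (assumed form): plain k-means over query-aware document embeddings."""
    k = min(k, len(embeddings))
    centroids = [list(e) for e in embeddings[:k]]  # deterministic initialisation
    clusters: List[List[int]] = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for i, e in enumerate(embeddings):
            nearest = min(range(k), key=lambda c: _dist(e, centroids[c]))
            clusters[nearest].append(i)
        centroids = [
            _centroid([embeddings[i] for i in members]) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    return [members for members in clusters if members]


def merge_dropped(kept: List[List[int]], dropped: List[List[int]],
                  embeddings: List[Vector]) -> List[List[int]]:
    """Simplified stand-in for merging: a document from a discarded agent is
    retained only if it lies within the radius of its nearest surviving cluster."""
    merged = [list(c) for c in kept]
    centers = [_centroid([embeddings[i] for i in c]) for c in kept]
    radii = [max(_dist(embeddings[i], ctr) for i in c) for c, ctr in zip(kept, centers)]
    for cluster in dropped:
        for i in cluster:
            j = min(range(len(kept)), key=lambda c: _dist(embeddings[i], centers[c]))
            if _dist(embeddings[i], centers[j]) <= radii[j]:
                merged[j].append(i)
    return merged


def winnow(query: str, docs: List[str], embeddings: List[Vector], k: int,
           agent: Agent, critic: Critic, max_rounds: int = 3) -> Tuple[str, List[int]]:
    """Stage II (assumed form): the critic iteratively discards agents; some of
    their documents are merged into the surviving agents' clusters."""
    clusters = cluster_documents(embeddings, k)
    for _ in range(max_rounds):
        if len(clusters) <= 1:
            break
        answers = [agent(query, [docs[i] for i in c]) for c in clusters]
        keep = set(critic(query, answers))
        if not keep or len(keep) == len(clusters):
            break  # critic discarded nothing (or everything): stop winnowing
        kept = [c for j, c in enumerate(clusters) if j in keep]
        dropped = [c for j, c in enumerate(clusters) if j not in keep]
        clusters = merge_dropped(kept, dropped, embeddings)
    final_ids = sorted(i for c in clusters for i in c)
    return agent(query, [docs[i] for i in final_ids]), final_ids


# Toy data and stub models, for illustration only.
docs = ["Paris is the capital of France.", "Lyon is the capital of France.",
        "The French capital is Paris.", "Lyon hosts the government."]
embeddings = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]]


def toy_agent(query: str, context: List[str]) -> str:
    return "Paris" if sum("Paris" in d for d in context) * 2 >= len(context) else "Lyon"


def toy_critic(query: str, answers: List[str]) -> List[int]:
    return [j for j, a in enumerate(answers) if a == "Paris"]


answer, kept_ids = winnow("What is the capital of France?", docs, embeddings,
                          2, toy_agent, toy_critic)
```

In a real system the stubs would be replaced by LLM calls and the toy vectors by embeddings from an actual encoder. The control flow stays the same: cluster, answer per cluster, critique, then merge and discard.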
