Oreo: A Plug-in Context Reconstructor to Enhance Retrieval-Augmented Generation
Sha Li, Naren Ramakrishnan
TL;DR
This paper tackles the unreliability and inefficiency of vanilla retrieval-augmented generation by introducing Oreo, a plug-in context reconstructor that refines retrieved chunks into concise, query-focused context. Oreo employs a retrieve-reconstruct-then-generate pipeline trained in three stages—supervised fine-tuning, contrastive multitask learning, and reinforcement learning alignment—to align reconstructed context with generator needs. It demonstrates consistent gains on single- and multi-hop open-domain QA tasks, while substantially reducing input length and latency and showing robustness to noise and order perturbations. The work advances practical RAG systems by enabling seamless integration with existing retrievers and generators, with strong implications for scalable, factual QA in real-world settings.
Abstract
Retrieval-Augmented Generation (RAG) aims to augment the capabilities of Large Language Models (LLMs) by retrieving and incorporating external documents or chunks prior to generation. However, even improved retriever relevance can bring erroneous or contextually distracting information, undermining the effectiveness of RAG in downstream tasks. We introduce a compact, efficient, and pluggable module designed to refine retrieved chunks before using them for generation. The module aims to extract and reorganize the most relevant and supportive information into a concise, query-specific format. Through a three-stage training paradigm (supervised fine-tuning, contrastive multi-task learning, and reinforcement learning-based alignment), it prioritizes critical knowledge and aligns it with the generator's preferences. This approach enables LLMs to produce outputs that are more accurate, reliable, and contextually appropriate.
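To make the retrieve-reconstruct-then-generate flow concrete, the following is a minimal sketch of how a pluggable reconstructor could sit between an existing retriever and generator. All names here (`rag_with_reconstructor`, `toy_retriever`, `toy_reconstructor`, `toy_generator`) are hypothetical and introduced only for illustration. The keyword-overlap filter is a crude stand-in for the trained reconstructor, not the method described in the paper, and the three-stage training is not shown.

```python
from typing import Callable, List

# Hypothetical interfaces: each stage is a plain callable so that any
# retriever or generator can be swapped in without changing the pipeline.
Retriever = Callable[[str, int], List[str]]
Reconstructor = Callable[[str, List[str]], str]
Generator = Callable[[str, str], str]


def rag_with_reconstructor(
    query: str,
    retriever: Retriever,
    reconstructor: Reconstructor,
    generator: Generator,
    top_k: int = 5,
) -> str:
    """Retrieve chunks, condense them into query-focused context, then generate."""
    chunks = retriever(query, top_k)          # 1. retrieve
    context = reconstructor(query, chunks)    # 2. reconstruct (the plug-in step)
    return generator(query, context)          # 3. generate


def _terms(text: str) -> set:
    """Lowercased word set with basic punctuation stripped."""
    return {w.strip("?.,").lower() for w in text.split()}


def toy_retriever(query: str, top_k: int) -> List[str]:
    # Assumption: a fixed toy corpus; a real retriever would rank by relevance.
    corpus = [
        "Paris is the capital of France.",
        "Bananas are rich in potassium.",
        "France is a country in Western Europe.",
    ]
    return corpus[:top_k]


def toy_reconstructor(query: str, chunks: List[str]) -> str:
    # Stand-in only: keep chunks sharing at least one word with the query.
    # The actual module is a trained model, not a keyword filter.
    query_terms = _terms(query)
    kept = [c for c in chunks if query_terms & _terms(c)]
    return " ".join(kept)


def toy_generator(query: str, context: str) -> str:
    # Stand-in for an LLM call: simply echoes the prompt it would receive.
    return f"Question: {query}\nContext: {context}"


if __name__ == "__main__":
    print(
        rag_with_reconstructor(
            "What is the capital of France?",
            toy_retriever,
            toy_reconstructor,
            toy_generator,
        )
    )
```

Because the reconstructor only consumes retrieved chunks and emits a single context string, it can be dropped into an existing pipeline without modifying the retriever or the generator, which is the sense in which the module is described as pluggable.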
