Meta Knowledge for Retrieval Augmented Large Language Models
Laurent Mombaerts, Terry Ding, Adi Banerjee, Florian Felice, Jonathan Taws, Tarik Borogovac
TL;DR
The paper tackles the challenge of retrieving and synthesizing information from large, diverse knowledge bases in retrieval augmented generation (RAG) systems. It introduces a data-centric workflow called PR3 (prepare-then-rewrite-then-retrieve-then-read) that generates per-document metadata and synthetic QA, and uses Meta Knowledge Summaries (MK Summaries) to condition personalized query augmentation. The authors show that augmented queries based on synthetic QA outperform traditional document-chunking RAG, and that MK Summary-based augmentation further improves recall, breadth, depth, and relevancy, without requiring model fine-tuning, while remaining cost-effective. The approach advances end-to-end reasoning across multiple documents and demonstrates practical potential for domain-specific knowledge retrieval, with limitations around metadata generation and opportunities for multi-hop reasoning in future work.
Abstract
Retrieval Augmented Generation (RAG) is a technique used to augment Large Language Models (LLMs) with contextually relevant, time-critical, or domain-specific information without altering the underlying model parameters. However, constructing RAG systems that can effectively synthesize information from a large and diverse set of documents remains a significant challenge. We introduce a novel data-centric RAG workflow for LLMs, transforming the traditional retrieve-then-read system into a more advanced prepare-then-rewrite-then-retrieve-then-read framework, to achieve higher domain expert-level understanding of the knowledge base. Our methodology relies on generating metadata and synthetic Questions and Answers (QA) for each document, as well as introducing the new concept of Meta Knowledge Summary (MK Summary) for metadata-based clusters of documents. The proposed innovations enable personalized user-query augmentation and in-depth information retrieval across the knowledge base. Our research makes two significant contributions: using LLMs as evaluators and employing new comparative performance metrics, we demonstrate that (1) using augmented queries with synthetic question matching significantly outperforms traditional RAG pipelines that rely on document chunking (p < 0.01), and (2) meta knowledge-augmented queries additionally significantly improve retrieval precision and recall, as well as the final answers' breadth, depth, relevancy, and specificity. Our methodology is cost-effective, costing less than $20 per 2000 research papers using Claude 3 Haiku, and can be combined with any fine-tuning of either the language or embedding models to further enhance the performance of end-to-end RAG pipelines.
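To make the four stages concrete, the following is a minimal Python sketch of a prepare-then-rewrite-then-retrieve-then-read loop. It is an illustration under stated assumptions, not the authors' implementation: every name in it (SyntheticQA, embed, prepare, rewrite, retrieve, read) is hypothetical, the bag-of-words embed function stands in for a real embedding model, and the places where the paper would call an LLM (generating QA pairs, writing the MK Summary, augmenting the query, producing the final answer) are replaced by simple placeholders. The synthetic QA pairs are hand-written toy data.

```python
import math
import re
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class SyntheticQA:
    doc_id: str
    tag: str       # metadata label assigned to the source document
    question: str  # synthetic question (LLM-generated in the paper; hand-written here)
    answer: str


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def prepare(qa_items):
    """Prepare: index synthetic questions and build one summary per metadata tag."""
    index = [(embed(qa.question), qa) for qa in qa_items]
    by_tag = defaultdict(list)
    for qa in qa_items:
        by_tag[qa.tag].append(qa.question)
    # Placeholder: an LLM would summarize each cluster; here the questions are joined.
    mk_summaries = {tag: " ".join(questions) for tag, questions in by_tag.items()}
    return index, mk_summaries


def rewrite(query, mk_summary):
    """Rewrite: placeholder for LLM query augmentation conditioned on the MK Summary."""
    return [query, f"{query} {mk_summary}"]


def retrieve(queries, index, k=2):
    """Retrieve: match queries against synthetic questions rather than document chunks."""
    best = {}
    for q in queries:
        qv = embed(q)
        for i, (vec, _) in enumerate(index):
            best[i] = max(best.get(i, 0.0), cosine(qv, vec))
    top = sorted(best, key=best.get, reverse=True)[:k]
    return [index[i][1] for i in top]


def read(query, hits):
    """Read: placeholder for the final LLM call; only assembles the prompt context."""
    context = "\n".join(f"Q: {h.question}\nA: {h.answer}" for h in hits)
    return f"Question: {query}\nContext:\n{context}"


qa_items = [
    SyntheticQA("doc1", "retrieval", "How does query augmentation improve retrieval recall?", "It broadens the search."),
    SyntheticQA("doc2", "retrieval", "What is document chunking?", "Splitting documents into passages."),
    SyntheticQA("doc3", "training", "What does fine-tuning change in a model?", "The model parameters."),
]

index, mk_summaries = prepare(qa_items)
query = "query augmentation recall"
hits = retrieve(rewrite(query, mk_summaries["retrieval"]), index, k=2)
prompt = read(query, hits)
print(prompt)
```

The design point the sketch tries to capture is that retrieval runs over synthetic questions rather than document chunks, and that the query is expanded with cluster-level context before matching. How the MK Summary is actually written and how many augmented queries are generated are choices the abstract does not specify.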
