Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models

Jiatao Li; Xinyu Hu; Xunjian Yin; Xiaojun Wan

TL;DR

The paper examines how self-generated documents (Self-Docs), produced entirely from an LLM’s internal memory, affect retrieval-augmented generation (RAG) performance across knowledge-intensive QA tasks. It first assesses the baseline utility of Self-Docs (RQ1), then builds a taxonomy of Self-Docs using Systemic Functional Linguistics (SFL) (RQ2), and finally explores integrating Self-Docs with external sources such as Wikipedia (RQ3), through both direct mixing and style transformation. Key contributions include validating the utility of Self-Docs, proposing an SFL-based eight-type taxonomy, and providing actionable guidelines for task-aligned Self-Doc design and external-content harmonization; the evidence indicates that styled integration often yields robust improvements. The study finds that model scale, Self-Doc attributes (tone, granularity, structure), and careful external integration jointly shape RAG performance on open-domain QA, multi-hop reasoning, fact verification, and long-form answers. These findings offer practical guidance for building knowledge-intensive QA systems that use Self-Docs alongside retrieved content, while cautioning about task-specific variability and ethical concerns around factual accuracy.

Abstract

The integration of documents generated by LLMs themselves (Self-Docs) alongside retrieved documents has emerged as a promising strategy for retrieval-augmented generation systems. However, previous research primarily focuses on optimizing the use of Self-Docs, with their inherent properties remaining underexplored. To bridge this gap, we first investigate the overall effectiveness of Self-Docs, identifying key factors that shape their contribution to RAG performance (RQ1). Building on these insights, we develop a taxonomy grounded in Systemic Functional Linguistics to compare the influence of various Self-Docs categories (RQ2) and explore strategies for combining them with external sources (RQ3). Our findings reveal which types of Self-Docs are most beneficial and offer practical guidelines for leveraging them to achieve significant improvements in knowledge-intensive question answering tasks.
