Retrieve-Plan-Generation: An Iterative Planning and Answering Framework for Knowledge-Intensive LLM Generation

Yuanjie Lyu, Zihan Niu, Zheyong Xie, Chao Zhang, Tong Xu, Yang Wang, Enhong Chen

TL;DR

The paper proposes the Retrieve-Plan-Generation (RPG) framework, which uses a simple but effective multi-task prompt-tuning method to enable existing LLMs to handle both planning and answering. RPG is compared with baselines across 5 knowledge-intensive generation tasks, demonstrating the effectiveness of the approach.

Abstract

Despite the significant progress of large language models (LLMs) in various tasks, they often produce factual errors due to their limited internal knowledge. Retrieval-Augmented Generation (RAG), which enhances LLMs with external knowledge sources, offers a promising solution. However, these methods can be misled by irrelevant paragraphs in retrieved documents. Due to the inherent uncertainty in LLM generation, inputting the entire document may introduce off-topic information, causing the model to deviate from the central topic and affecting the relevance of the generated content. To address these issues, we propose the Retrieve-Plan-Generation (RPG) framework. RPG generates plan tokens to guide subsequent generation in the plan stage. In the answer stage, the model selects relevant fine-grained paragraphs based on the plan and uses them for further answer generation. This plan-answer process is repeated iteratively until completion, enhancing generation relevance by focusing on specific topics. To implement this framework efficiently, we utilize a simple but effective multi-task prompt-tuning method, enabling the existing LLMs to handle both planning and answering. We comprehensively compare RPG with baselines across 5 knowledge-intensive generation tasks, demonstrating the effectiveness of our approach.
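
The iterative plan-answer process described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`split_paragraphs`, `select_paragraphs`, `rpg_generate`), the word-overlap scoring, the blank-line paragraph splitting, and the toy `plan_fn` / `answer_fn` callables are all hypothetical stand-ins. In RPG itself, planning, paragraph selection, and answering are performed by a prompt-tuned LLM.

```python
import re
from typing import Callable, List, Optional

# Hypothetical callable types standing in for the LLM's two roles.
PlanFn = Callable[[str, List[str]], Optional[str]]
AnswerFn = Callable[[str, str, List[str], List[str]], str]


def _words(text: str) -> set:
    """Lowercased word set, with punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))


def split_paragraphs(documents: List[str]) -> List[str]:
    """Break retrieved documents into fine-grained paragraphs (assumed blank-line separated)."""
    paragraphs = []
    for doc in documents:
        for para in doc.split("\n\n"):
            para = para.strip()
            if para:
                paragraphs.append(para)
    return paragraphs


def select_paragraphs(plan: str, paragraphs: List[str], top_k: int = 2) -> List[str]:
    """Keep the paragraphs that share the most words with the plan.

    Word overlap is an assumption made for illustration only.
    """
    plan_words = _words(plan)
    scored = []
    for idx, para in enumerate(paragraphs):
        overlap = len(plan_words & _words(para))
        if overlap > 0:
            scored.append((overlap, idx, para))
    # Highest overlap first; ties keep the original paragraph order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [para for _, _, para in scored[:top_k]]


def rpg_generate(question: str, documents: List[str], plan_fn: PlanFn,
                 answer_fn: AnswerFn, max_steps: int = 8, top_k: int = 2) -> str:
    """Alternate plan and answer stages until the planner signals completion."""
    paragraphs = split_paragraphs(documents)
    segments: List[str] = []
    for _ in range(max_steps):
        plan = plan_fn(question, segments)      # plan stage: next topic, or None when done
        if plan is None:
            break
        evidence = select_paragraphs(plan, paragraphs, top_k)
        segments.append(answer_fn(question, plan, evidence, segments))  # answer stage
    return " ".join(segments)


# Toy stand-ins for the model, used only to exercise the loop.
def toy_plan(question: str, segments: List[str]) -> Optional[str]:
    topics = ["habitat", "diet"]
    return topics[len(segments)] if len(segments) < len(topics) else None


def toy_answer(question: str, plan: str, evidence: List[str], segments: List[str]) -> str:
    return f"[{plan}] " + (evidence[0] if evidence else "no evidence found")


docs = [
    "Otters live in a river habitat.\n\nThe stock market fell today.",
    "The otter diet is mostly fish.",
]
result = rpg_generate("Tell me about otters", docs, toy_plan, toy_answer)
print(result)
```

In this toy run, the off-topic paragraph about the stock market is never passed to the answer stage, because no plan selects it. This mirrors, in a much simplified form, the motivation for planning before answering.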

Paper Structure

This paper contains 38 sections, 3 equations, 4 figures, 16 tables, 1 algorithm.

Figures (4)

  • Figure 1: The retrieved documents contain off-topic paragraphs (highlighted in yellow), which can cause RAG outputs to deviate from the topic. By planning first (highlighted in green), selecting relevant fine-grained paragraphs, and then answering, the plan-answer iteration ensures a more consistent and relevant generation.
  • Figure 2: Illustration of the proposed RPG. The left shows the training process, where plan and answer tasks use the same example data, different loss functions, and train two task-specific prompts simultaneously. The right shows the inference process, where the plan-answer process is repeated iteratively until completion.
  • Figure 3: Illustration of the data processing for one of the segments in a sample.
  • Figure 4: Training scale analysis.