Table of Contents
Fetching ...

Injecting External Knowledge into the Reasoning Process Enhances Retrieval-Augmented Generation

Minghao Tang, Shiyu Ni, Jiafeng Guo, Keping Bi

TL;DR

This work tackles the vulnerability of retrieval-augmented generation (RAG) to noisy retrieved passages by proposing Passage Injection, which explicitly inserts retrieved passages into the reasoning phase of reasoning-enhanced LLMs. The method is validated across BM25 retrieval and four QA datasets using multiple reasoning-enhanced models, showing improved overall performance and, crucially, robustness to both random and counterfactual noise. Analyses reveal greater gains for multi-hop questions and show that the improvements largely stem from better handling noisy information rather than merely leveraging gold passages. The approach offers a practical path to more reliable RAG systems and is accompanied by publicly available code for reproducibility and further exploration.

Abstract

Retrieval-augmented generation (RAG) has been widely adopted to augment large language models (LLMs) with external knowledge for knowledge-intensive tasks. However, its effectiveness is often undermined by the presence of noisy (i.e., low-quality) retrieved passages. Enhancing LLMs' robustness to such noise is critical for improving the reliability of RAG systems. Recent advances have equipped LLMs with strong reasoning and self-reflection capabilities, allowing them to identify and correct errors in their reasoning process. Inspired by this ability, we propose Passage Injection-a simple yet effective method that explicitly incorporates retrieved passages into LLMs' reasoning process, aiming to enhance the model's ability to recognize and resist noisy passages. We validate Passage Injection under general RAG settings using BM25 as the retriever. Experiments on four reasoning-enhanced LLMs across four factual QA datasets demonstrate that Passage Injection significantly improves overall RAG performance. Further analysis on two noisy retrieval settings-random noise, where the model is provided irrelevant passages, and counterfactual noise, where it is given misleading passages-shows that Passage Injection consistently improves robustness. Controlled experiments confirm that Passage Injection can also effectively leverage helpful passages. These findings suggest that incorporating passages in LLMs' reasoning process is a promising direction for building more robust RAG systems. The code can be found \href{here}{https://github.com/Trustworthy-Information-Access/Passage-Injection}.

Injecting External Knowledge into the Reasoning Process Enhances Retrieval-Augmented Generation

TL;DR

This work tackles the vulnerability of retrieval-augmented generation (RAG) to noisy retrieved passages by proposing Passage Injection, which explicitly inserts retrieved passages into the reasoning phase of reasoning-enhanced LLMs. The method is validated across BM25 retrieval and four QA datasets using multiple reasoning-enhanced models, showing improved overall performance and, crucially, robustness to both random and counterfactual noise. Analyses reveal greater gains for multi-hop questions and show that the improvements largely stem from better handling noisy information rather than merely leveraging gold passages. The approach offers a practical path to more reliable RAG systems and is accompanied by publicly available code for reproducibility and further exploration.

Abstract

Retrieval-augmented generation (RAG) has been widely adopted to augment large language models (LLMs) with external knowledge for knowledge-intensive tasks. However, its effectiveness is often undermined by the presence of noisy (i.e., low-quality) retrieved passages. Enhancing LLMs' robustness to such noise is critical for improving the reliability of RAG systems. Recent advances have equipped LLMs with strong reasoning and self-reflection capabilities, allowing them to identify and correct errors in their reasoning process. Inspired by this ability, we propose Passage Injection-a simple yet effective method that explicitly incorporates retrieved passages into LLMs' reasoning process, aiming to enhance the model's ability to recognize and resist noisy passages. We validate Passage Injection under general RAG settings using BM25 as the retriever. Experiments on four reasoning-enhanced LLMs across four factual QA datasets demonstrate that Passage Injection significantly improves overall RAG performance. Further analysis on two noisy retrieval settings-random noise, where the model is provided irrelevant passages, and counterfactual noise, where it is given misleading passages-shows that Passage Injection consistently improves robustness. Controlled experiments confirm that Passage Injection can also effectively leverage helpful passages. These findings suggest that incorporating passages in LLMs' reasoning process is a promising direction for building more robust RAG systems. The code can be found \href{here}{https://github.com/Trustworthy-Information-Access/Passage-Injection}.

Paper Structure

This paper contains 16 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: (a) An example where the retrieved passages contain misleading information. The passage incorrectly states that Northern Ireland is part of the United States, while the correct answer is the United Kingdom. In this case, Vanilla RAG mistakenly follows the external misleading information and produces an incorrect answer. In contrast, Passage Injection identifies the external misinformation and generates the correct answer, demonstrating its enhanced robustness to noisy passages. (b) The performance of Qwen3-14B and Qwen3-32B under general RAG settings with different methods. We use BM25 to retrieve documents and provide the top 1, 3, and 5 most relevant documents to the model. The results show that Passage Injection significantly improves RAG performance across different numbers of passages.
  • Figure 2: F1 scores under different noise settings. Left: Average performance on four factual QA datasets with random unrelated passages. Right: Performance on ConFiQA with counterfactual, misleading contexts.
  • Figure 3: Average performance on 2WikiMultihopQA and HotpotQA using only gold passages. "Distill-Qwen-32B" refers to DeepSeek-R1-Distill-Qwen-32B.