Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation

Zexiong Ma, Shengnan An, Zeqi Lin, Yanzhen Zou, Jian-Guang Lou, Bing Xie

TL;DR

This work tackles hallucination in retrieval-augmented generation with parallel context extension by introducing DePaC, which combines context-aware negative training (NegTrain) with information-calibrated aggregation (ICA) to reduce fact fabrication and fact omission. NegTrain explicitly teaches the model to refuse to answer when the retrieved contexts do not support a question, while ICA emphasizes context windows that yield higher information gain, as measured by $D_{KL}$-based changes in uncertainty. Across nine RAG tasks, DePaC consistently improves performance, mitigates hallucinations, and remains effective as the number of candidate documents increases, with ablations showing that both NegTrain and ICA are essential. The method also generalizes across different LLM backbones, including Mistral-7B and Llama3-8B, highlighting its practical impact for efficient long-context reasoning in real-world RAG applications.

Abstract

Large language models (LLMs) are susceptible to generating hallucinated information, despite the integration of retrieval-augmented generation (RAG). Parallel context extension (PCE) is a line of research that attempts to effectively integrate parallel (unordered) contexts, but it still suffers from hallucinations when adapted to RAG scenarios. In this paper, we propose DePaC (Dehallucinating Parallel Context Extension), which alleviates the hallucination problem with context-aware negative training and information-calibrated aggregation. DePaC is designed to alleviate two types of in-context hallucination: fact fabrication (i.e., LLMs present claims that are not supported by the contexts) and fact omission (i.e., LLMs fail to present claims that can be supported by the contexts). Specifically, (1) for fact fabrication, we apply context-aware negative training, which fine-tunes the LLMs with negative supervision and thus explicitly guides them to refuse to answer when the contexts are not related to the question; (2) for fact omission, we propose information-calibrated aggregation, which prioritizes context windows with a higher information increment from their contexts. Experimental results on nine RAG tasks demonstrate that DePaC significantly alleviates the two types of hallucination and consistently achieves better performance on these tasks.
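
To make the aggregation idea concrete, the following is a minimal sketch of one plausible reading of information-calibrated aggregation. It is not the authors' implementation. It assumes that the "information increment" of a window is the KL divergence between the next-token distribution conditioned on that window and a context-free distribution, and that the window with the largest increment is chosen at each decoding step. The function names, the toy three-token vocabulary, and the probabilities are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def select_window_by_information_gain(window_dists, no_context_dist):
    """Return the index of the window with the largest information increment.

    Assumption (not confirmed by the source): the increment of a window is
    KL(p_window || p_no_context), i.e. how far the window's context moved the
    next-token distribution away from the context-free one.
    """
    gains = [kl_divergence(p, no_context_dist) for p in window_dists]
    best = max(range(len(gains)), key=gains.__getitem__)
    return best, gains


# Hypothetical toy vocabulary: ["Paris", "London", "unknown"]
no_context = [0.30, 0.30, 0.40]
windows = [
    [0.30, 0.30, 0.40],  # irrelevant window: distribution is unchanged
    [0.90, 0.05, 0.05],  # window that contains the supporting fact
    [0.25, 0.35, 0.40],  # weakly related window
]

best, gains = select_window_by_information_gain(windows, no_context)
print("selected window:", best)
```

Under these assumptions, a window whose context leaves the model's prediction unchanged contributes no increment and therefore cannot dominate the aggregation, which is the failure mode the abstract describes as fact omission.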

Paper Structure

This paper contains 36 sections, 16 equations, 11 figures, 5 tables.

Figures (11)

  • Figure 1: DePaC significantly reduces the occurrence of hallucinations in responses within RAG scenarios.
  • Figure 2: Existing PCE approaches face two types of in-context hallucination issues when applied to RAG: (1) Fact fabrication. The LLM generates fabricated answers that are inconsistent with the contextual facts. (2) Fact omission. Windows that lack the required information disproportionately influence the aggregation function, causing critical information in other windows to be disregarded.
  • Figure 3: DePaC consists of two key components: (1) a context-aware negative training technique to alleviate fact fabrication, and (2) an information-calibrated aggregation method to alleviate fact omission.
  • Figure 4: Attention pattern and execution time comparison between DePaC and vanilla inference. The execution time of DePaC increases linearly with context length, while the complexity of vanilla inference grows quadratically.
  • Figure 5: Hallucination percentage in responses for the information seeking tasks.
  • ...and 6 more figures
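
The scaling contrast stated in the Figure 4 caption can be illustrated by counting query-key attention pairs. This is a rough sketch under simplifying assumptions, not a measurement of execution time: every document is assumed to fill one window of fixed length, windows attend only within themselves in the parallel case, and the question tokens, generated tokens, and causal masking are ignored. The function names and sizes are hypothetical.

```python
def attention_pairs_vanilla(num_docs, window_len):
    """All documents concatenated into one sequence: every token attends to every token."""
    total_len = num_docs * window_len
    return total_len * total_len


def attention_pairs_parallel(num_docs, window_len):
    """Each document sits in its own window: attention stays inside the window."""
    return num_docs * window_len * window_len


# Doubling the number of documents doubles the parallel cost
# but quadruples the cost of attending over the concatenated sequence.
for num_docs in (4, 8, 16):
    vanilla = attention_pairs_vanilla(num_docs, window_len=100)
    parallel = attention_pairs_parallel(num_docs, window_len=100)
    print(num_docs, vanilla, parallel)
```

With a fixed window length, the parallel count grows in proportion to the number of documents, while the concatenated count grows with its square.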