Copy-Paste to Mitigate Large Language Model Hallucinations

Yongchao Long, Xian Wu, Yingying Zhang, Xianbin Wen, Yuxi Zhou, Shenda Hong

TL;DR

This work tackles contextual faithfulness in retrieval-augmented LLMs by addressing hallucinations that arise when internal parametric knowledge competes with external context. It introduces CopyPasteLLM, a two-stage approach that first generates high-copying responses via three CopyPaste-Prompting methods and then internalizes this preference with direct preference optimization, achieving substantial data-efficient improvements on FaithEval, ConFiQA, and PubMedQA with as few as 365 training samples. A novel Context-Parameter Copying Capturing analysis reveals that CopyPasteLLM strengthens reliance on contextual evidence while recalibrating internal parametric confidence, rather than merely enhancing contextual representations. The work demonstrates strong empirical gains, provides an interpretable mechanistic pipeline, and offers a reproducible framework for reducing RAG hallucinations in knowledge-intensive domains.

Abstract

While Retrieval-Augmented Generation (RAG) enables large language models (LLMs) to generate contextually grounded responses, contextual faithfulness remains challenging because LLMs may not consistently trust the provided context, leading to hallucinations that undermine reliability. We observe an inverse correlation between response copying degree and context-unfaithful hallucinations on RAGTruth, suggesting that higher copying degrees reduce hallucinations by fostering genuine contextual belief. We propose CopyPasteLLM, obtained through two-stage high-copying response preference training. We design three prompting methods to enhance copying degree, demonstrating that high-copying responses achieve superior contextual faithfulness and hallucination control. These approaches enable a fully automated pipeline that transforms generated responses into high-copying preference data for training CopyPasteLLM. On FaithEval, ConFiQA and PubMedQA, CopyPasteLLM achieves the best performance in both counterfactual and original contexts, with accuracy improvements of 12.2% to 24.5% on FaithEval over the best baseline, while requiring only 365 training samples -- 1/50th of the baseline data. To elucidate CopyPasteLLM's effectiveness, we propose the Context-Parameter Copying Capturing algorithm. Interestingly, this reveals that CopyPasteLLM recalibrates reliance on internal parametric knowledge rather than external knowledge during generation. All code is available at https://github.com/longyongchao/CopyPasteLLM
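
To make "copying degree" concrete, the following is a minimal sketch of one way such a measure could be computed. This excerpt does not give the paper's exact definition, so the sketch assumes the extractive-fragment coverage and density statistics commonly used in summarization work: greedily match the longest token spans the response shares with the context, then report the fraction of copied tokens (coverage) and the mean squared fragment length (density). The function names, the whitespace tokenizer, and the example strings are illustrative assumptions, not the authors' implementation.

```python
from typing import List


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization, chosen only for simplicity.
    return text.lower().split()


def extractive_fragments(context: List[str], response: List[str]) -> List[List[str]]:
    """Greedily collect the longest spans of `response` that also occur in `context`."""
    fragments = []
    i = 0
    while i < len(response):
        best = 0
        for j in range(len(context)):
            k = 0
            # Extend the match while both sequences agree token by token.
            while (i + k < len(response) and j + k < len(context)
                   and response[i + k] == context[j + k]):
                k += 1
            best = max(best, k)
        if best > 0:
            fragments.append(response[i:i + best])
            i += best
        else:
            i += 1  # token not found in the context; skip it
    return fragments


def copy_coverage(context: str, response: str) -> float:
    """Fraction of response tokens that belong to a copied fragment."""
    resp = tokenize(response)
    if not resp:
        return 0.0
    frags = extractive_fragments(tokenize(context), resp)
    return sum(len(f) for f in frags) / len(resp)


def copy_density(context: str, response: str) -> float:
    """Mean squared fragment length; rewards long contiguous copies."""
    resp = tokenize(response)
    if not resp:
        return 0.0
    frags = extractive_fragments(tokenize(context), resp)
    return sum(len(f) ** 2 for f in frags) / len(resp)


if __name__ == "__main__":
    ctx = "the drug reduced blood pressure in adults"
    resp = "the drug reduced blood sugar"
    print(copy_coverage(ctx, resp))  # 0.8 (4 of 5 tokens are copied)
    print(copy_density(ctx, resp))   # 3.2 (one fragment of length 4: 16 / 5)
```

Under this assumed definition, a response that quotes the context verbatim gets coverage 1.0 and a density equal to its own length, while a fully paraphrased response scores near zero on both.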

Paper Structure

This paper contains 45 sections, 2 equations, 9 figures, 5 tables, 4 algorithms.

Figures (9)

  • Figure 1: Upper: Comparison of response composition patterns between CopyPaste and mainstream approaches. Lower: Inverse correlation between copying degree and faithfulness hallucination across different models. Kernels ($\blacksquare$) show copying degree; bars ($\blacksquare$) show hallucination.
  • Figure 2: Two-stage CopyPaste pipeline: Stage 1 constructs high-copying responses; Stage 2 filters, judges, stamps answers, and aligns preferences to train CopyPasteLLM.
  • Figure 3: Logits power distribution across response lengths for contextual (CTX) and parametric (Para.) knowledge. Values above x=0 indicate CTX logits power, values below x=0 indicate Para. logits power (negated for visualization).
  • Figure 4: Dimensionality reduction visualization of hidden-state distributions for contextual (CTX) and parametric (Para.) knowledge on the PubMedQA dataset across two base models. Each subplot shows pairwise comparisons with marginal KDE distributions and confidence ellipses. See the corresponding appendix figures for RAGTruth and FaithEval.
  • Figure 5: Copying degree across models and datasets. Point size represents copy density ($\delta$) values converted to circular area.
  • ...and 4 more figures