CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG

Boyi Deng, Wenjie Wang, Fengbin Zhu, Qifan Wang, Fuli Feng

TL;DR

CrAM is a credibility-aware RAG mechanism that requires no fine-tuning: it downweights low-credibility retrieved documents by modifying the attention weights of selected influential attention heads. The method identifies these heads with an extension of causal tracing and, at inference time, adjusts their attention weights using document credibility scores. On two open-domain QA benchmarks, Natural Questions and TriviaQA, CrAM outperforms non-SFT baselines and often surpasses SFT-based methods, defending against misinformation with modest computational overhead. The approach is a practical, scalable way to mitigate misinformation in RAG systems, and the code is available for reproducibility.

Abstract

Retrieval-Augmented Generation (RAG) can alleviate hallucinations of Large Language Models (LLMs) by referencing external documents. However, misinformation in external documents may mislead LLMs' generation. To address this issue, we explore the task of "credibility-aware RAG", in which LLMs automatically adjust the influence of retrieved documents based on their credibility scores to counteract misinformation. To this end, we introduce a plug-and-play method named $\textbf{Cr}$edibility-aware $\textbf{A}$ttention $\textbf{M}$odification (CrAM). CrAM identifies influential attention heads in LLMs and adjusts their attention weights based on the credibility of the documents, thereby reducing the impact of low-credibility documents. Experiments on Natural Questions and TriviaQA using Llama2-13B, Llama3-8B, and Qwen1.5-7B show that CrAM improves the RAG performance of LLMs against misinformation pollution by over 20%, even surpassing supervised fine-tuning methods.
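
The attention adjustment described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `reweight_attention`, the per-token document labels, and the choice to multiply each weight by its document's credibility score and then renormalize are all assumptions made for illustration. In CrAM, a step of this kind would be applied only in the attention heads identified as influential.

```python
def reweight_attention(attn_row, token_doc_ids, credibility):
    """Scale one attention row by per-document credibility, then renormalize.

    Hypothetical helper for illustration only.
    attn_row: attention weights over the context tokens (sums to 1).
    token_doc_ids: for each token, the index of the retrieved document it
        belongs to, or None for tokens outside any document (e.g. the question).
    credibility: one score in [0, 1] per retrieved document (assumed given).
    """
    # Tokens outside retrieved documents keep their weight; document tokens
    # are scaled by the credibility score of their document.
    scaled = [
        w if doc is None else w * credibility[doc]
        for w, doc in zip(attn_row, token_doc_ids)
    ]
    total = sum(scaled)
    if total == 0.0:
        # Nothing left to normalize; fall back to the original weights.
        return list(attn_row)
    # Renormalize so the row is again a probability distribution.
    return [w / total for w in scaled]


# One question token, one token from document 0, two tokens from document 1.
attn = [0.2, 0.3, 0.3, 0.2]
doc_ids = [None, 0, 1, 1]
cred = [1.0, 0.0]  # document 1 is treated as not credible at all
new_attn = reweight_attention(attn, doc_ids, cred)
```

With a credibility score of 0 for document 1, its tokens receive no attention after the adjustment, and the remaining weight is redistributed over the question token and document 0.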

Paper Structure (34 sections, 6 equations, 20 figures, 4 tables)

Figures (20)

  • Figure 1: A comparison between RAG and credibility-aware RAG. Credibility-aware RAG considers credibility to reduce the impact of low-credibility documents.
  • Figure 2: Illustration of CrAM. Compared to RAG, CrAM first identifies influential attention heads and then modifies their attention weights based on the credibility scores of each document.
  • Figure 3: Performance comparison of CrAM and CAG-13B as the number of documents containing misinformation varies, under the ideal setting.
  • Figure 4: Performance change on NQ as the number of documents containing misinformation varies.
  • Figure 5: Performance on NQ and TriviaQA as the size of the dataset used to determine the influential attention heads changes.
  • ...and 15 more figures