Detecting Contextual Hallucinations in LLMs with Frequency-Aware Attention

Siya Qi; Yudong Chen; Runcong Zhao; Qinglin Zhu; Zhanghao Hu; Wei Liu; Yulan He; Zheng Yuan; Lin Gui

TL;DR

This work tackles contextual hallucinations in retrieval-augmented generation by proposing a frequency-aware analysis of internal attention. By treating attention as discrete signals and extracting high-frequency energy through spectral operators like the Discrete Fourier Transform, Discrete Wavelet Transform, and the discrete Laplacian, the authors build a lightweight detector using a linear classifier to identify token- and span-level hallucinations. The approach yields consistent gains over verification-based, internal-representation-based, and other attention-based baselines across multiple models and tasks (RAGTruth and HalluRAG), with Fourier-based features generally performing best. Layer-wise and head-wise analyses reveal that mid-layer, sparse, context-focused high-frequency attention patterns are particularly informative for grounding, suggesting frequency-aware signals as a robust intrinsic diagnostic and potential mitigation tool for unreliable LLM generation.
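The idea of "high-frequency energy" in an attention signal can be illustrated with a minimal sketch. It assumes the signal is the list of attention weights one generated token assigns to successive context positions; the `cutoff_ratio` parameter, the function names, and the energy-ratio normalization are illustrative assumptions, not the authors' exact definitions.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def fourier_high_energy(attention, cutoff_ratio=0.25):
    """Share of spectral energy in bins above a cutoff frequency.

    Bin k has folded frequency index min(k, n - k), which ranges from 0 (DC)
    to n // 2. Bins above cutoff_ratio * (n / 2) count as "high".
    cutoff_ratio is a hypothetical knob chosen for this sketch.
    """
    n = len(attention)
    power = [abs(c) ** 2 for c in dft(attention)]
    total = sum(power)
    if total == 0.0:
        return 0.0
    threshold = cutoff_ratio * (n / 2)
    high = sum(p for k, p in enumerate(power) if min(k, n - k) > threshold)
    return high / total


def laplacian_energy(attention):
    """Sum of squared second differences a[i-1] - 2*a[i] + a[i+1]."""
    return sum(
        (attention[i - 1] - 2.0 * attention[i] + attention[i + 1]) ** 2
        for i in range(1, len(attention) - 1)
    )


# Toy signals (made up for illustration, not taken from the paper).
smooth = [0.125] * 8                                # evenly spread attention
spiky = [0.4, 0.0, 0.3, 0.0, 0.2, 0.0, 0.1, 0.0]    # rapidly alternating attention
```

Under these assumptions, both measures are near zero for `smooth` and clearly larger for `spiky`, which matches the intuition that fragmented attention carries more high-frequency energy.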

Abstract

Hallucination detection is critical for ensuring the reliability of large language models (LLMs) in context-based generation. Prior work has explored intrinsic signals available during generation, among which attention offers a direct view of grounding behavior. However, existing approaches typically rely on coarse summaries that fail to capture fine-grained instabilities in attention. Inspired by signal processing, we introduce a frequency-aware perspective on attention by analyzing its variation during generation. We model attention distributions as discrete signals and extract high-frequency components that reflect rapid local changes in attention. Our analysis reveals that hallucinated tokens are associated with high-frequency attention energy, reflecting fragmented and unstable grounding behavior. Based on this insight, we develop a lightweight hallucination detector using high-frequency attention features. Experiments on the RAGTruth and HalluRAG benchmarks show that our approach achieves performance gains over verification-based, internal-representation-based, and attention-based methods across models and tasks.
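A lightweight detector of the kind described above could be sketched as follows. This is an assumption-laden illustration: it uses a one-level Haar wavelet as the high-pass operator, produces one feature per (layer, head), and scores with a logistic model. The `attention_maps` layout, the function names, and the weights and bias are all hypothetical placeholders; in practice the weights would be fitted on labeled data.

```python
import math


def haar_detail_energy(signal):
    """Energy of the level-1 Haar wavelet detail coefficients.

    Each coefficient is (a[2i] - a[2i+1]) / sqrt(2), so the energy grows when
    neighbouring attention weights differ sharply. An odd trailing element is
    ignored in this sketch.
    """
    pairs = zip(signal[0::2], signal[1::2])
    return sum((a - b) ** 2 / 2.0 for a, b in pairs)


def head_features(attention_maps):
    """Flatten per-head energies into one feature vector.

    attention_maps[l][h] is assumed to hold the weights that the current
    token assigns to the context tokens in layer l, head h.
    """
    return [haar_detail_energy(head) for layer in attention_maps for head in layer]


def hallucination_score(features, weights, bias):
    """Logistic score from a linear model; weights and bias are placeholders."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Toy input: 2 layers x 2 heads, 4 context tokens each (made-up values).
toy_maps = [
    [[0.25, 0.25, 0.25, 0.25], [0.5, 0.0, 0.5, 0.0]],
    [[0.1, 0.2, 0.3, 0.4], [0.7, 0.1, 0.1, 0.1]],
]
toy_features = head_features(toy_maps)
```

A linear model keeps the detector cheap to run and makes per-layer and per-head weights directly inspectable, which is one plausible reason to prefer it over a deeper classifier.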

Paper Structure (58 sections, 5 theorems, 50 equations, 10 figures, 10 tables)

Introduction
Background
Attention-based Hallucination Detection
A Frequency-based View of Attention
Frequency-Aware Attention Modeling
Motivation and Intuition
Problem Setup
Attention as Discrete Signals
Energy-Based High-Frequency Instability
Experiment Setting
Baselines
Implementation Settings
Results and Analysis
Overall Performance
Span-level Hallucination Detection
...and 43 more sections

Key Result

Lemma 1.5

Under Assumption ass:labels_app, for any $j\ge 1$, the probability of

Figures (10)

Figure 1: Attention weights over context and previously generated tokens for a grounded token (blue, "rainy") and a hallucinated token (red, "December") in a context-based QA example.
Figure 2: Three hypotheses for identifying hallucination tokens from incoming attention patterns. We illustrate three representative assumptions for distinguishing well-grounded tokens (✓) from potential hallucinations (✗) based on the incoming attention to the next generated token.
Figure 3: Overview of frequency-aware attention modeling for hallucination detection. Attention weights are extracted from each layer and head ($L$ layers and $H$ heads in total), treated as token-level signals, and decomposed using high-pass filtering to isolate high-frequency variations ($\mathcal{F}_{\text{high}}$), whose energy is aggregated for hallucination detection.
Figure 4: Comparing full-, low-, and high-pass Fourier attention features. Average AUROC across models under token- and span-level evaluation settings.
Figure 5: Layer-wise importance of Fourier high-frequency attention features for LLaMA-7B.
...and 5 more figures

Theorems & Definitions (10)

Lemma 1.5: Switch probability
proof
Lemma 1.6: Logit adjacent-difference energy
proof
Lemma 1.7: Pairwise softmax difference identity
proof
Lemma 1.8: Softmax transfer lower bound
proof
Theorem 1.9: Monotone $K$-dependent lower bound for attention roughness
proof
