What do Geometric Hallucination Detection Metrics Actually Measure?

Eric Yeats; John Buckheit; Sarah Scullen; Brendan Kennedy; Loc Truong; Davis Brown; Bill Kay; Cliff Joslyn; Tegan Emerson; Michael J. Henry; John Emanuello; Henry Kvinge

TL;DR

This work investigates what geometric statistics computed from an LLM's internal state actually measure about hallucinations, and how well they transfer across domains. Using a cross-domain dataset that covers multiple hallucination types, it evaluates Hidden Score, Matrix Entropy, and Attention Score, and shows that different statistics detect different hallucination properties and that domain shift degrades cross-domain performance. A simple perturbation-based normalization is proposed and shown to yield large AUROC gains (roughly +34 to +40 points) in multi-domain settings. Overall, the paper clarifies the relationship between geometric signals and hallucination types and offers a practical way to make internal-state detectors more reliable across domains.

Abstract

Hallucination remains a barrier to deploying generative models in high-consequence applications. This is especially true in cases where external ground truth is not readily available to validate model outputs. This situation has motivated the study of geometric signals in the internal state of an LLM that are predictive of hallucination and require limited external knowledge. Because a range of factors can lead model output to be called a hallucination (e.g., irrelevance vs. incoherence), in this paper we ask what specific properties of a hallucination these geometric statistics actually capture. To assess this, we generate a synthetic dataset that varies distinct properties of output associated with hallucination: correctness, confidence, relevance, coherence, and completeness. We find that different geometric statistics capture different types of hallucinations. Along the way we show that many existing geometric detection methods are substantially sensitive to shifts in task domain (e.g., math questions vs. history questions). Motivated by this, we introduce a simple normalization method to mitigate the effect of domain shift on geometric statistics, leading to AUROC gains of +34 points in multi-domain settings.
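The domain-shift effect described above can be illustrated with a small synthetic sketch. Everything here is an assumption made for illustration: the scores are simulated Gaussians rather than real geometric statistics, the helper names (`auroc`, `make_domain`, `standardize`) are hypothetical, and per-domain standardization is only a stand-in that shows why aligning domains helps. It is not the perturbation normalization proposed in the paper.

```python
import random
import statistics


def auroc(scores, labels):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_domain(rng, offset, n=200):
    """Simulate a detector score for one domain (illustrative assumption):
    hallucinated responses (label 1) score about 1 unit higher than correct
    ones, and the whole domain is shifted by `offset`."""
    scores, labels = [], []
    for i in range(n):
        y = i % 2  # alternate correct / hallucinated
        scores.append(offset + 1.0 * y + rng.gauss(0.0, 0.5))
        labels.append(y)
    return scores, labels


def standardize(scores):
    """Z-score within one domain (a stand-in for the paper's normalization)."""
    mu = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mu) / sd for s in scores]


rng = random.Random(0)
math_s, math_y = make_domain(rng, offset=0.0)  # hypothetical "math" domain
hist_s, hist_y = make_domain(rng, offset=5.0)  # hypothetical "history" domain

single = auroc(math_s, math_y)
pooled_raw = auroc(math_s + hist_s, math_y + hist_y)
pooled_norm = auroc(standardize(math_s) + standardize(hist_s), math_y + hist_y)

print(f"single-domain AUROC:       {single:.3f}")
print(f"pooled AUROC (raw scores): {pooled_raw:.3f}")
print(f"pooled AUROC (aligned):    {pooled_norm:.3f}")
```

In this toy setup the score separates the classes well within each domain, but a single threshold over pooled raw scores performs worse because the domain offset is larger than the within-domain gap. Aligning the domains first recovers most of the single-domain performance. Note that this stand-in needs domain labels and a batch of scores per domain, which a practical method may not have.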

Paper Structure (26 sections, 4 equations, 6 figures, 1 table)

Introduction
Geometric Hallucination Detection Metrics
What do Geometric Statistics Measure?
Dataset Design
Data Recording
Experimental Results and Analysis
Single-Domain Performance on Factual Incorrectness
Sensitivity to Alternative Hallucination Types
Domain Shift Harms Detection of Factual Incorrectness
Mitigating Domain Shift
Results
Conclusion
Related Work: LLM Hallucination Detection
Retrieval-Augmented Generation:
Consistency Methods:
...and 11 more sections

Figures (6)

Figure 1: Distributions of HS (left) and AS (right) for correct and incorrect (level 3) responses for each domain. Solid lines: distribution means. Shaded areas: one standard deviation.
Figure 2: Our perturbation normalization significantly reduces domain shift, leading to boosted detection performance for incorrectness.
Figure 3: The responses of the geometric statistics (each row) are plotted for each alternative hallucination type (each column).
Figure 4: Geometric statistics (each row) are correlated with (factual) incorrectness on each domain (each column), motivating their use as hallucination detector scores. However, geometric statistics are impacted by domain shift, harming their cross-domain performance.
Figure 5: Domain alignment of the three geometric statistics using our proposed perturbation normalization technique.
...and 1 more figure
