PAS: Prelim Attention Score for Detecting Object Hallucinations in Large Vision-Language Models

Nhat Hoang-Xuan, Minh Vu, My T. Thai, Manish Bhattarai

TL;DR

This paper tackles object hallucination in Large Vision-Language Models (LVLMs) by showing that hallucinations often arise when the model over-relies on prelim tokens (previously generated output tokens) rather than image content. It introduces PAS, a training-free detector that measures the attention flowing from prelim tokens to object tokens, and supports it with an information-theoretic interpretation based on mutual information. Rather than estimating this dependence by sampling across images, PAS offers a practical, on-the-fly alternative that uses the attention signal. PAS computes a single scalar per object token, $s_{\text{prel}}$, from the attention heads of a layer $l$ (typically layer 0), enabling real-time detection with no extra forward passes. Experiments across LLaVA-1.5-7B, MiniGPT-4, and Shikra on MSCOCO and Pascal VOC show PAS achieving state-of-the-art AUROC compared to baselines such as NLL, Entropy, SVAR, IC, and GLSim, with low memory overhead. The work suggests that prelim over-dependence marks an unstable, alternative operating mode of LVLMs, and it provides a practical tool for real-time filtering and intervention to improve reliability in LVLM deployments.

Abstract

Large vision-language models (LVLMs) are powerful, yet they remain unreliable due to object hallucinations. In this work, we show that in many hallucinatory predictions the LVLM effectively ignores the image and instead relies on previously generated output (prelim) tokens to infer new objects. We quantify this behavior via the mutual information between the image and the predicted object conditioned on the prelim, demonstrating that weak image dependence strongly correlates with hallucination. Building on this finding, we introduce the Prelim Attention Score (PAS), a lightweight, training-free signal computed from attention weights over prelim tokens. PAS requires no additional forward passes and can be computed on the fly during inference. Exploiting this previously overlooked signal, PAS achieves state-of-the-art object-hallucination detection across multiple models and datasets, enabling real-time filtering and intervention.
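The score described above can be illustrated with a minimal sketch. It assumes the attention weights of one layer are already available as an array for the query position that produces the object token, and that the prelim tokens occupy a contiguous index range. The function name `prelim_attention_score`, the argument layout, and the choice to average over heads are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def prelim_attention_score(attn_row, prelim_start, prelim_end):
    """Sum attention mass placed on prelim tokens, averaged over heads.

    attn_row:     array of shape (num_heads, seq_len); attention weights of a
                  single layer for the query position that yields the object
                  token (assumed layout).
    prelim_start: index of the first prelim (previously generated) token.
    prelim_end:   one past the index of the last prelim token.
    """
    attn_row = np.asarray(attn_row, dtype=float)
    if attn_row.ndim != 2:
        raise ValueError("attn_row must have shape (num_heads, seq_len)")
    if not (0 <= prelim_start < prelim_end <= attn_row.shape[1]):
        raise ValueError("prelim range is outside the sequence")
    # Attention mass on prelim positions, one value per head.
    per_head = attn_row[:, prelim_start:prelim_end].sum(axis=-1)
    # Head aggregation by mean is an assumption made for this sketch.
    return float(per_head.mean())

# Toy input: 2 heads, 6 positions; positions 4 and 5 play the role of prelim tokens.
toy_row = np.array([
    [0.1, 0.1, 0.1, 0.1, 0.3, 0.3],   # head 0 puts 0.6 on the prelim
    [0.2, 0.2, 0.2, 0.2, 0.1, 0.1],   # head 1 puts 0.2 on the prelim
])
score = prelim_attention_score(toy_row, prelim_start=4, prelim_end=6)
print(round(score, 3))  # about 0.4
```

Because the attention weights are already produced during decoding, a score of this kind needs no additional forward pass; a higher value would flag the object token as more likely hallucinated.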

Paper Structure

This paper contains 30 sections, 10 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Illustration of our findings and proposed method. Top (Investigate): We show that LVLM token generation depends on four token types and argue that prelim tokens are a vital, overlooked signal. Bottom Left (Hypothesize & Validate): We visually illustrate our core hypothesis: hallucinations (e.g., "bottle") occur when the model over-relies on prelim tokens, while real objects (e.g., "cups") rely more on the image. Bottom Right (Detect): Based on this insight, we propose PAS, which quantifies this over-reliance by summing the attention weights from prelim tokens to the object token.
  • Figure 2: Visualization of Prelim Attention Score (PAS) for a single sample. We show the per-token attention for a suffix of the prelim for a hallucinated and a real object token. Darker red indicates higher attention to the corresponding object token, and a higher PAS score indicates a higher chance of hallucination.
  • Figure 3: Score distributions for real and hallucinated object tokens across different models on the MSCOCO dataset. The dashed lines denote the quartiles of each distribution.
  • Figure 4: The ROC and PRC curves for object hallucination detection of our method and the baselines for LLaVA-1.5-7B on the MSCOCO dataset. The dashed line indicates chance performance.
  • Figure 5: Comparison of different layers' performance in detecting object hallucination on the MSCOCO dataset.
  • ...and 1 more figure
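The detection results are reported as AUROC, which can be read as the probability that a randomly chosen hallucinated object token receives a higher score than a randomly chosen real one. The sketch below computes that quantity directly from pairwise comparisons. The function name `auroc` and the toy scores and labels are made up for illustration and are not results from the paper.

```python
def auroc(scores, labels):
    """Pairwise AUROC: label 1 = hallucinated, label 0 = real; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count how often a hallucinated token outranks a real one.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy example (illustrative values only).
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
print(auroc(toy_scores, toy_labels))  # 0.75
```

A value of 0.5 corresponds to chance performance, which is the level the dashed line marks on an ROC plot.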