PAS: Prelim Attention Score for Detecting Object Hallucinations in Large Vision-Language Models
Nhat Hoang-Xuan, Minh Vu, My T. Thai, Manish Bhattarai
TL;DR
This paper tackles object hallucination in Large Vision-Language Models (LVLMs) by showing that hallucinations often arise when the model over-relies on previously generated output (prelim) tokens rather than image content. It supports this with an information-theoretic interpretation: the mutual information between the image and the predicted object, conditioned on the prelim, measures image dependence, and weak image dependence correlates with hallucination. Because estimating that quantity requires sampling across images, the paper introduces PAS, a training-free detector that instead uses the attentional flow from prelim tokens to object tokens as a practical, on-the-fly alternative. PAS computes a single scalar per object token, $s_{\text{prel}}$, from the attention heads of a layer $l$ (typically layer 0), enabling real-time detection with no extra forward passes. Experiments with LLaVA-1.5-7B, MiniGPT-4, and Shikra on MSCOCO and Pascal VOC show PAS achieving state-of-the-art AUROC against baselines such as NLL, Entropy, SVAR, IC, and GLSim, with low memory overhead. The work suggests that prelim overdependence marks an unstable, alternative operating mode of LVLMs and provides a practical tool for real-time filtering and intervention to improve reliability in LVLM deployments.
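To make the idea of a per-token attention score concrete, here is a minimal sketch of one plausible way to reduce attention weights to a single scalar. This summary does not give the exact aggregation used for $s_{\text{prel}}$, so the function name `prelim_attention_score`, the choice to sum attention mass over prelim positions, and the averaging over heads are all assumptions made for illustration, not the paper's definition.

```python
import numpy as np


def prelim_attention_score(attn_row, prelim_positions):
    """Illustrative score: attention mass placed on prelim tokens.

    attn_row: array of shape (num_heads, seq_len) holding, for one layer,
        the attention weights at the object token's position
        (each head's row is assumed to sum to 1).
    prelim_positions: slice or index array selecting the positions of
        previously generated (prelim) tokens.

    Assumption: sum over prelim positions, then average over heads.
    """
    attn_row = np.asarray(attn_row, dtype=float)
    if attn_row.ndim != 2:
        raise ValueError("attn_row must have shape (num_heads, seq_len)")
    per_head_mass = attn_row[:, prelim_positions].sum(axis=1)
    return float(per_head_mass.mean())


# Toy input: 2 heads, 4 positions; positions 2 and 3 are prelim tokens.
toy_attn = np.array([
    [0.10, 0.20, 0.30, 0.40],
    [0.25, 0.25, 0.25, 0.25],
])
toy_score = prelim_attention_score(toy_attn, slice(2, 4))
```

In this toy setup a higher score means more attention mass on earlier output tokens, which is the behavior the summary associates with hallucination. Because the weights are already produced during decoding, a score of this kind needs no additional forward pass.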
Abstract
Large vision-language models (LVLMs) are powerful, yet they remain unreliable due to object hallucinations. In this work, we show that in many hallucinatory predictions the LVLM effectively ignores the image and instead relies on previously generated output (prelim) tokens to infer new objects. We quantify this behavior via the mutual information between the image and the predicted object conditioned on the prelim, demonstrating that weak image dependence strongly correlates with hallucination. Building on this finding, we introduce the Prelim Attention Score (PAS), a lightweight, training-free signal computed from attention weights over prelim tokens. PAS requires no additional forward passes and can be computed on the fly during inference. Exploiting this previously overlooked signal, PAS achieves state-of-the-art object-hallucination detection across multiple models and datasets, enabling real-time filtering and intervention.
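For reference, the image-dependence quantity described above is a conditional mutual information. The standard definition is given below, where the symbols $V$ (image), $Y$ (predicted object), and $C$ (prelim context) are chosen here for illustration and may differ from the paper's notation:

```latex
\[
I(V; Y \mid C)
  = H(Y \mid C) - H(Y \mid V, C)
  = \mathbb{E}_{v, y, c}\!\left[ \log \frac{p(y \mid v, c)}{p(y \mid c)} \right]
\]
```

When this value is close to zero, conditioning on the image barely changes the distribution over the predicted object once the prelim is known, which corresponds to the weak image dependence that the abstract links to hallucination.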
