VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation

Ruiyang Zhang, Hu Zhang, Zhedong Zheng

TL;DR

VL-Uncertainty detects hallucinations in large vision-language models (LVLMs) using the model's own intrinsic uncertainty. It applies semantic-equivalent perturbations to both the visual prompt (blurring) and the textual prompt (LLM rephrasing), clusters the resulting answers by semantic meaning, and takes the entropy of the cluster distribution as the uncertainty measure, giving a continuous, threshold-free signal for hallucination detection. Across 10 LVLMs and 4 benchmarks, the method consistently outperforms strong baselines, and ablations show the importance of perturbation design and semantic equivalence. The approach is scalable, annotation-free, and suited to safety-critical deployments where unknown-domain problems arise.

Abstract

Because large vision-language models (LVLMs) process a higher information load than single-modal LLMs, detecting LVLM hallucinations costs more human effort and time, which raises wider safety concerns. In this paper, we introduce VL-Uncertainty, the first uncertainty-based framework for detecting hallucinations in LVLMs. Unlike most existing methods, which require ground-truth or pseudo annotations, VL-Uncertainty uses uncertainty as an intrinsic metric. We measure uncertainty by analyzing the prediction variance across semantically equivalent but perturbed prompts, covering both visual and textual data. When LVLMs are highly confident, they provide consistent responses to semantically equivalent queries. When they are uncertain, the responses of the target LVLM become more random. To account for semantically similar answers with different wordings, we cluster LVLM responses by their semantic content and then calculate the entropy of the cluster distribution as the uncertainty measure used to detect hallucination. Our extensive experiments on 10 LVLMs across four benchmarks, covering both free-form and multi-choice tasks, show that VL-Uncertainty significantly outperforms strong baseline methods in hallucination detection.
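The clustering-and-entropy step described in the abstract can be illustrated with a minimal sketch. It assumes the answers to the perturbed prompts have already been collected. The function names (`cluster_answers`, `cluster_entropy`, `naive_same_meaning`) are hypothetical, and the string-matching equivalence check is only a stand-in for whatever semantic-equivalence judgment the paper actually uses.

```python
import math
from typing import Callable, List


def naive_same_meaning(a: str, b: str) -> bool:
    """Stand-in equivalence check (an assumption, not the paper's method):
    two answers match if they are equal after lowercasing and dropping punctuation."""
    def norm(s: str) -> List[str]:
        return "".join(ch for ch in s.lower() if ch.isalnum() or ch.isspace()).split()
    return norm(a) == norm(b)


def cluster_answers(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> List[List[str]]:
    """Greedy clustering: each answer joins the first cluster whose
    representative (first member) it matches, else it starts a new cluster."""
    clusters: List[List[str]] = []
    for ans in answers:
        for cluster in clusters:
            if same_meaning(cluster[0], ans):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    return clusters


def cluster_entropy(clusters: List[List[str]]) -> float:
    """Shannon entropy (in nats) of the cluster-size distribution."""
    total = sum(len(c) for c in clusters)
    if total == 0:
        return 0.0
    # p * log(1/p) summed over clusters, with p = len(c) / total
    return sum((len(c) / total) * math.log(total / len(c)) for c in clusters)


consistent = ["A cat.", "a cat", "A cat"]
scattered = ["a cat", "a dog", "a cat", "a dog"]
low = cluster_entropy(cluster_answers(consistent, naive_same_meaning))
high = cluster_entropy(cluster_answers(scattered, naive_same_meaning))
```

In this toy run the consistent answers fall into one cluster and get zero entropy, while the evenly split answers get a higher value, matching the intuition that scattered answers signal uncertainty.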

Paper Structure

This paper contains 16 sections, 3 equations, 21 figures, 13 tables.

Figures (21)

  • Figure 1: Our motivation. External evaluator-based methods usually suffer from missing knowledge when applied to new domains (see (a)). In contrast, our VL-Uncertainty elicits the intrinsic uncertainty of the LVLM through the proposed semantic-equivalent perturbation. The refined uncertainty estimate then facilitates reliable LVLM hallucination detection (see (b)).
  • Figure 2: Comparison between semantic-equivalent perturbations and inequivalent ones. LVLMs inevitably generate hallucinatory answers (see (a)). While semantic-inequivalent perturbations may yield correct answers, they provide no insight into the uncertainty of the LVLM for the original query, as shown in (b). In contrast, responses to semantically equivalent perturbed prompts, though potentially incorrect, offer valuable insight into the intrinsic uncertainty of the LVLM. When only the surface presentation of the prompt is altered, fluctuation in the answers indicates elevated uncertainty (see (c)). This distinction highlights the utility of semantic-equivalent perturbations in assessing the reliability and consistency of LVLM responses.
  • Figure 3: Overall illustration of our proposed VL-Uncertainty. To mine uncertainty arising from both modalities, we apply semantic-equivalent perturbations (left) to the visual and textual prompts. For the visual prompt, the original image is blurred to varying degrees, mimicking human visual perception. For the textual prompt, a pre-trained LLM is prompted at different temperatures to rephrase the original question in a semantic-equivalent manner. A detailed instruction is designed so that rephrasing preserves the original semantics. Prompt pairs with varying degrees of perturbation are used to elicit LVLM uncertainty. We cluster the LVLM answer set by semantic meaning and use the entropy of the answer cluster distribution as the LVLM uncertainty (right). The estimated uncertainty serves as a continuous indicator of different levels of LVLM hallucination.
  • Figure 4: Qualitative comparison between VL-Uncertainty and baselines. We present a sample from a free-form benchmark. For this hallucinatory sample, the pseudo-annotation-based method [liu2023mitigating] fails to interpret the underlying logic and thus misses the hallucination (see (a)). For semantic entropy [farquhar2024detecting], vanilla multi-sampling proves ineffective for mining LVLM uncertainty (see (b)). In contrast, our proposed semantic-equivalent perturbation on both visual and textual prompts successfully elicits LVLM uncertainty. This refined uncertainty estimation enables successful detection of the LVLM hallucination (see (c)).
  • Figure 5: Uncertainty distribution for hallucinatory and non-hallucinatory LVLM answers on MMVet. Our VL-Uncertainty accurately assigns high uncertainty to hallucinatory answers and low uncertainty to non-hallucinatory answers. This distinct uncertainty distribution gap facilitates LVLM hallucination detection.
  • ...and 16 more figures
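
The perturbation step described in the Figure 3 caption (blurring the image to varying degrees and rephrasing the question at different temperatures) can be sketched as follows. This is only an illustration under stated assumptions: a pure-Python box blur on a grayscale grid stands in for whatever blur the authors use, `rephrase` is a hypothetical callable wrapping an LLM, and the radii, temperatures, and the choice to pair them level by level are made up for the example.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def box_blur(image: Image, radius: int) -> Image:
    """Mean filter over a (2*radius+1) square window, clipped at the borders.
    A stand-in blur (assumption); radius 0 returns an unchanged copy."""
    if radius <= 0 or not image:
        return [row[:] for row in image]
    h, w = len(image), len(image[0])
    out: Image = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def build_perturbed_pairs(
    image: Image,
    question: str,
    rephrase: Callable[[str, float], str],  # hypothetical LLM wrapper
    radii: Sequence[int] = (1, 2, 3),  # illustrative values only
    temperatures: Sequence[float] = (0.5, 1.0, 1.5),  # illustrative values only
) -> List[Tuple[Image, str]]:
    """Pair the i-th blur level with the i-th rephrasing temperature."""
    return [
        (box_blur(image, r), rephrase(question, t))
        for r, t in zip(radii, temperatures)
    ]


# Toy usage with a fake rephraser that just tags the temperature.
toy_image = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
pairs = build_perturbed_pairs(
    toy_image, "What animal is shown?", lambda q, t: f"{q} (t={t})"
)
```

Each resulting pair would be sent to the target LVLM, and the collected answers would then be clustered by meaning to estimate uncertainty.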