Taming Object Hallucinations with Verified Atomic Confidence Estimation

Jiarui Liu; Weihao Xuan; Zhijing Jin; Mona Diab

TL;DR

TACO introduces a lightweight, four-stage framework to curb object hallucinations in multimodal language models: (1) decomposing queries into atomic binary checks, (2) paraphrasing them to improve robustness, (3) estimating confidence via self-consistency or self-confidence, and (4) refining answers with an LLM. It eliminates reliance on external vision experts by performing internal self-verification and calibrating certainty through either a black-box or gray-box aggregation of paraphrase responses. Across five benchmarks (POPE, MME, HallusionBench, AMBER, MM-Hal Bench) and two state-of-the-art MLLMs (LLaVA-1.5-7B and CogVLM2), TACO consistently reduces hallucinations and improves confidence calibration, with self-confidence (gray-box) generally outperforming self-consistency (black-box). The work also analyzes bias reduction, the impact of query reformulations, and the limitations of handling negative questions, underscoring TACO’s practical potential for improving trustworthiness in multimodal perception tasks. Overall, TACO demonstrates that a self-verification, paraphrase-based calibration loop can meaningfully enhance the faithfulness of MLLM outputs without heavy reliance on external vision modules.
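The black-box (self-consistency) aggregation can be illustrated with a minimal sketch. It assumes the MLLM is reachable only through a text-in, text-out callable, and that confidence is taken as the fraction of paraphrases agreeing with the majority answer; the function names, the toy model, and the example questions are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Callable, Sequence


def self_consistency(
    ask: Callable[[str], str],
    paraphrases: Sequence[str],
) -> tuple[str, float]:
    """Return the majority answer and the fraction of paraphrases agreeing with it."""
    if not paraphrases:
        raise ValueError("need at least one paraphrase")
    # Normalise answers so "Yes" and "yes " count as the same vote.
    answers = [ask(q).strip().lower() for q in paraphrases]
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)


# Hypothetical stand-in for an MLLM queried about one fixed image.
# It flips its answer for one wording to mimic sensitivity to phrasing.
def toy_model(question: str) -> str:
    return "no" if "visible" in question else "yes"


paraphrases = [
    "Is there a dog in the image?",
    "Does the image contain a dog?",
    "Can a dog be seen in the picture?",
    "Is a dog visible in the photo?",
]
answer, confidence = self_consistency(toy_model, paraphrases)
print(answer, confidence)
```

Here three of the four paraphrases agree, so the atomic check is accepted with a confidence below 1 rather than being treated as certain.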

Abstract

Multimodal Large Language Models (MLLMs) often suffer from hallucinations, particularly errors in object existence, attributes, or relations, which undermine their reliability. We introduce TACO (Verified Atomic Confidence Estimation), a simple framework that mitigates hallucinations through self-verification and confidence calibration without relying on external vision experts. TACO decomposes responses into atomic queries, paraphrases them to reduce sensitivity to wording, and estimates confidence using self-consistency (black-box) or self-confidence (gray-box) aggregation, before refining answers with a language model. Experiments on five benchmarks (POPE, MME, HallusionBench, AMBER, and MM-Hal Bench) with two MLLMs (LLaVA-1.5-7B and CogVLM2) show that TACO consistently outperforms direct prompting and Visual Contrastive Decoding, reduces systematic biases, and improves confidence calibration, demonstrating its effectiveness in enhancing the faithfulness of MLLMs.
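A gray-box (self-confidence) variant might look like the sketch below. It assumes the model exposes a probability for the "yes" token, that per-paraphrase probabilities are combined by a plain mean, and that a claim is kept when the mean reaches 0.5; the aggregation rule, the threshold, and all names and numbers are illustrative assumptions rather than the paper's specification.

```python
from statistics import fmean
from typing import Callable, Mapping, Sequence


def self_confidence(
    yes_prob: Callable[[str], float],
    paraphrases: Sequence[str],
) -> float:
    """Mean probability of 'yes' across paraphrases of one atomic query."""
    return fmean([yes_prob(q) for q in paraphrases])


def verify_claims(
    yes_prob: Callable[[str], float],
    claims: Mapping[str, Sequence[str]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> dict[str, tuple[bool, float]]:
    """Map each atomic claim to (kept?, confidence)."""
    results = {}
    for claim, paraphrases in claims.items():
        p = self_confidence(yes_prob, paraphrases)
        results[claim] = (p >= threshold, p)
    return results


# Hypothetical token probabilities standing in for a real MLLM's output.
toy_probs = {
    "Is there a dog in the image?": 0.9,
    "Does the image contain a dog?": 0.8,
    "Is there a frisbee in the image?": 0.3,
    "Does the image contain a frisbee?": 0.1,
}
claims = {
    "dog": list(toy_probs)[:2],
    "frisbee": list(toy_probs)[2:],
}
results = verify_claims(toy_probs.__getitem__, claims)
print(results)
```

Claims that fall below the threshold (the frisbee in this toy case) are the ones a refinement step would remove or rewrite in the final answer.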
