HII-DPO: Eliminate Hallucination via Accurate Hallucination-Inducing Counterfactual Images

Yilin Yang; Zhenghui Guo; Yuke Wang; Omprakash Gnawali; Sheng Di; Chengming Zhang

TL;DR

The paper tackles language-prior-driven hallucinations in Vision-Language Models (VLMs) with a three-part framework: synthesizing Hallucination-Inducing Images (HIIs), constructing the Masked-Object-Hallucination (MOH) benchmark to quantify scene-conditioned hallucinations, and applying Direct Preference Optimization (DPO) on HIIs to ground model outputs more firmly in visual evidence. HIIs are created through an object-detection and iterative-masking pipeline, filtered by model-specific DDG responses, and used to build fine-grained preference data that focuses on hallucinated sentences. The reported results show state-of-the-art reductions in hallucination rates across multiple benchmarks and model scales (up to 38% improvement on standard hallucination benchmarks and up to 92% HR reduction in some tasks) while preserving general VQA capabilities. The approach provides a diagnostic tool (MOH) and a scalable alignment strategy (HII-DPO) for mitigating linguistic priors in multimodal systems, which matters for deploying VLMs in safety-critical domains.

Abstract

Large Vision-Language Models (VLMs) have achieved remarkable success across diverse multimodal tasks but remain vulnerable to hallucinations rooted in inherent language bias. Despite recent progress, existing hallucination mitigation methods often overlook the underlying hallucination patterns driven by language bias. In this work, we design a novel pipeline to accurately synthesize Hallucination-Inducing Images (HIIs). Using synthesized HIIs, we reveal a consistent scene-conditioned hallucination pattern: models tend to mention objects that are highly typical of the scene even when visual evidence is removed. To quantify the susceptibility of VLMs to this hallucination pattern, we establish the Masked-Object-Hallucination (MOH) benchmark to rigorously evaluate existing state-of-the-art alignment frameworks. Finally, we leverage HIIs to construct high-quality preference datasets for fine-grained alignment. Experimental results demonstrate that our approach effectively mitigates hallucinations while preserving general model capabilities. Specifically, our method achieves up to a 38% improvement over the current state-of-the-art on standard hallucination benchmarks.
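
For orientation, the standard DPO objective that preference alignment of this kind builds on can be written as below. This is the generic formulation, not necessarily the exact loss used in the paper; the symbols are assumptions for illustration: $x$ is the input (here, an image together with its prompt), $y_w$ and $y_l$ are the chosen and rejected responses, $\pi_\theta$ is the policy being trained, $\pi_{\text{ref}}$ is the frozen reference model, $\beta$ is a temperature-like coefficient, and $\sigma$ is the logistic function.

$$
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_{\text{ref}}) = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}} \left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)} \right) \right]
$$

Minimizing this loss raises the likelihood of the chosen response relative to the rejected one, measured against the reference model. When $y_w$ and $y_l$ share a prefix and differ only in a hallucinated sentence, the difference between the two log-ratios comes from the tokens after the shared prefix.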

Paper Structure (47 sections, 12 equations, 9 figures, 11 tables)

Introduction
Our Observation.
Our Contributions.
Preliminaries
Large Vision Language Model
Direct Preference Optimization
Method
Overview
Hallucination-Inducing Images (HIIs) Synthesis
Entity Detection.
Iterative Masking.
Model-Specific HII Filtering.
The Masked-Object-Hallucination Benchmark
Evaluation Tasks.
Scene Taxonomy.
...and 32 more sections

Figures (9)

Figure 1: Overview of our framework. (A) Synthesize HIIs using GroundingDINO and open-source VLMs. (B) Construct the MOH benchmark to quantitatively evaluate the scene-conditioned hallucination pattern. (C) Generate a preference dataset using HIIs.
Figure 2: Overview of the HII Synthesis Pipeline. (1) Potential entities are identified via GroundingDINO based on a predefined synonym dictionary. (2) A detection-masking cycle is employed to achieve complete occlusion of the target entity, yielding HII candidates. (3) The target VLM is tasked to perform DDG, and only images with HR $\ge$ 50% are retained.
Figure 3: Masked-Object-Hallucination Benchmark.
Figure 4: Preference Dataset Generation. Utilizing curated model-specific HIIs, we construct contrastive response pairs where the chosen and rejected responses share an identical prefix. Verified by GroundingDINO, the rejected responses contain the hallucinated objects, whereas the chosen response describes only factual entities.
Figure 5: Distribution of the top-5 masked objects across ten environmental settings. Each subplot illustrates the percentage of specific objects that trigger hallucinations within the scenario. These statistics reveal the ingrained scene-conditioned hallucination pattern within VLMs; for instance, “boat” dominates in waterfront context, while “train” is the primary driver of hallucinations in the railroad setting.
...and 4 more figures
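
The filtering step (3) in Figure 2 and the pairing rule in Figure 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. HR is assumed to be the fraction of sampled DDG responses that mention the masked object. Mentions are found by a naive whole-word match against a synonym list, whereas the paper verifies responses with GroundingDINO. The names `mentions`, `hallucination_rate`, `keep_as_hii`, `PreferencePair`, and `build_pair` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


def mentions(text: str, synonyms: Sequence[str]) -> bool:
    """Naive check: any synonym appears as a whole word (optional plural 's')."""
    if not synonyms:
        return False
    alternatives = "|".join(re.escape(s) for s in synonyms)
    pattern = rf"\b(?:{alternatives})s?\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def hallucination_rate(responses: Sequence[str], synonyms: Sequence[str]) -> float:
    """Assumed HR: share of responses that mention the masked object."""
    if not responses:
        return 0.0
    hits = sum(mentions(r, synonyms) for r in responses)
    return hits / len(responses)


def keep_as_hii(responses: Sequence[str], synonyms: Sequence[str],
                threshold: float = 0.5) -> bool:
    """Retain a masked image as an HII when HR >= threshold (50% in Figure 2)."""
    return hallucination_rate(responses, synonyms) >= threshold


@dataclass
class PreferencePair:
    prefix: str    # text shared by both responses
    chosen: str    # factual continuation after the prefix
    rejected: str  # hallucinated continuation after the prefix


def build_pair(sentences: Sequence[str], synonyms: Sequence[str],
               factual_sentence: str) -> Optional[PreferencePair]:
    """Split a response at its first hallucinated sentence.

    Sentences before that point form the shared prefix; the hallucinated
    sentence is the rejected continuation, and `factual_sentence` (supplied
    by the caller in this sketch) is the chosen one.
    """
    for i, sentence in enumerate(sentences):
        if mentions(sentence, synonyms):
            return PreferencePair(
                prefix=" ".join(sentences[:i]),
                chosen=factual_sentence,
                rejected=sentence,
            )
    return None  # no hallucinated sentence, so no pair is produced
```

In this sketch a full chosen or rejected response would be the prefix followed by the corresponding continuation. Whole-word matching can miss paraphrases or flag unrelated senses of a word, which is one reason a detector-based check is preferable in practice.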