Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection

Le Yang, Ziwei Zheng, Boxu Chen, Zhengyu Zhao, Chenhao Lin, Chao Shen

TL;DR

This work tackles object hallucination (OH) in large vision-language models (LVLMs) by introducing HalluSpace, a low-rank subspace that captures the differences between truthful and hallucinated features. The proposed method, Nullu, edits model weights by projecting them onto the null space of HalluSpace, suppressing untruthful priors without increasing inference cost. The pipeline uses paired truthful and hallucinated prompts to compute layer-wise difference matrices, extracts the top-$k$ HalluSpace directions via SVD, and uses them to form the edited weights ${\bm{W}}_\ell^{ed}$. Empirical results across multiple LVLM families and benchmarks (CHAIR, POPE, MME) show consistent OH reduction without sacrificing general performance. The paper also discusses connections to DPO and LLM priors, and code is released on GitHub.

Abstract

Recent studies have shown that large vision-language models (LVLMs) often suffer from object hallucinations (OH). To mitigate this issue, we introduce an efficient method that edits the model weights based on an unsafe subspace, which we call HalluSpace in this paper. With truthful and hallucinated text prompts accompanying the visual content as inputs, the HalluSpace can be identified by extracting the hallucinated embedding features and removing the truthful representations in LVLMs. By orthogonalizing the model weights, input features are projected into the null space of the HalluSpace to reduce OH, which is why we name our method Nullu. We reveal that HalluSpaces generally contain prior information from the large language models (LLMs) used to build LVLMs, which previous studies have identified as an essential cause of OH. Null space projection therefore suppresses the LLMs' priors to filter out the hallucinated features, resulting in contextually accurate outputs. Experiments show that our method effectively mitigates OH across different LVLM families without extra inference cost and also performs strongly on general LVLM benchmarks. Code is released at https://github.com/Ziwei-Zheng/Nullu.
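
The pipeline summarized above (paired features, difference matrix, SVD, null-space projection of the weights) can be illustrated with a minimal NumPy sketch. It is not the released implementation: the function names `extract_halluspace` and `project_weights`, the convention that a weight matrix maps an input `x` to features `weight @ x`, and the synthetic data with one planted direction are all assumptions made for illustration.

```python
import numpy as np


def extract_halluspace(truthful_feats, hallucinated_feats, k):
    """Return a (d, k) orthonormal basis for the top-k difference directions.

    Both inputs are (n_pairs, d) arrays of hidden features for one layer
    (hypothetical layout; row i of each array belongs to the same pair).
    """
    diff = hallucinated_feats - truthful_feats        # difference matrix
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    return vt[:k].T                                   # top-k right singular vectors


def project_weights(weight, v):
    """Project a (d, d_in) weight matrix onto the orthogonal complement of span(v)."""
    projector = np.eye(v.shape[0]) - v @ v.T
    return projector @ weight


# Synthetic stand-in data (assumption): hallucinated features differ from
# truthful ones mostly along a single planted direction, plus small noise.
rng = np.random.default_rng(0)
n_pairs, d, d_in = 64, 16, 8
planted = rng.normal(size=d)
planted /= np.linalg.norm(planted)
truthful = rng.normal(size=(n_pairs, d))
hallucinated = (truthful
                + 3.0 * rng.normal(size=(n_pairs, 1)) * planted
                + 0.01 * rng.normal(size=(n_pairs, d)))

V = extract_halluspace(truthful, hallucinated, k=1)
W = rng.normal(size=(d, d_in))
W_ed = project_weights(W, V)

# Features produced by the edited weights have no component along V.
x = rng.normal(size=d_in)
residual = np.abs(V.T @ (W_ed @ x)).max()
```

Because the projection is folded into the weights once, the edited layer has the same shape and cost as the original, which is consistent with the claim of no extra inference cost.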

Paper Structure

This paper contains 17 sections, 7 equations, 3 figures, 1 table, 1 algorithm.

Figures (3)

  • Figure 1: An illustration of Nullu. (a) In the editing phase, Nullu ① extracts hidden features of truthful (T.) and hallucinated (H.) inputs, ② identifies a low-rank HalluSpace ${\bm{v}}$ in the feature space by contrasting the T. and H. features, and ③ edits the model weights by projecting them onto the null space of ${\bm{v}}$. (b) In the inference phase, using the edited weights is equivalent to projecting input features into the safe subspace, away from the hallucinated areas, leading to contextually accurate outputs.
  • Figure 2: An overview of Nullu, which identifies the HalluSpaces used to edit the model weights of LVLMs. (a) The paired truthful and hallucinated samples. (b) Nullu first calculates the difference matrix of hidden features for the paired samples and then applies SVD to find the main directions of the difference, which form the HalluSpace. Nullu then projects the original MLP weights onto the null space of the HalluSpace. This procedure is repeated for a series of layers, $\{\ell\}$, in the LLM of an LVLM.
  • Figure 3: The relation between Nullu and other debiasing methods [leng2024mitigating; zhang2024debiasing]. The inference procedure of (a) ours and (b) VCD [leng2024mitigating]. (c) The statistics of the word frequency in outputs of Nullu (①), with LLM priors (②), LLaVA (③) and VCD (④).
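
The equivalence stated in the Figure 1(b) caption can be made explicit with a short derivation. This is a sketch under assumptions that are not taken from the paper's own equations: ${\bm{V}} \in \mathbb{R}^{d \times k}$ denotes a matrix whose orthonormal columns span the HalluSpace at layer $\ell$ (so ${\bm{V}}^\top{\bm{V}} = {\bm{I}}_k$), and the weight ${\bm{W}}_\ell$ maps an input ${\bm{x}}$ to the feature ${\bm{W}}_\ell{\bm{x}}$.

$$
{\bm{P}} = {\bm{I}} - {\bm{V}}{\bm{V}}^\top, \qquad {\bm{W}}_\ell^{ed} = {\bm{P}}\,{\bm{W}}_\ell
$$

$$
{\bm{W}}_\ell^{ed}{\bm{x}} = {\bm{P}}\,({\bm{W}}_\ell{\bm{x}}), \qquad {\bm{V}}^\top{\bm{W}}_\ell^{ed}{\bm{x}} = ({\bm{V}}^\top - {\bm{V}}^\top{\bm{V}}{\bm{V}}^\top)\,{\bm{W}}_\ell{\bm{x}} = {\bm{0}}
$$

Under these assumptions, applying the edited weights gives the same result as computing the original feature and then removing its HalluSpace component, and since ${\bm{P}}$ is absorbed into ${\bm{W}}_\ell^{ed}$ ahead of time, no extra computation is needed at inference.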