Mitigating Hallucination in Multimodal Reasoning via Functional Attention Control

Haolang Lu, Bolun Chu, WeiYe Fu, Guoshun Nan, Junning Liu, Minghui Pan, Qiankun Li, Yi Yu, Hua Wang, Kun Wang

TL;DR

The paper addresses hallucination in multimodal reasoning models by identifying a perception-then-reasoning dynamic and two root failure modes: perceptual bias in early layers and reasoning drift in deeper layers. It introduces a lightweight two-step plugin, Functional Head Identification and Class-conditioned Rescaling, that locates perception- and reasoning-oriented attention heads and selectively amplifies their contributions without retraining. Across three real-world MLRMs and six benchmarks, the approach yields consistent performance gains with minimal computational overhead, improving both perception-heavy and reasoning-heavy tasks as well as interpretability. The method is model-agnostic and plug-and-play, offering a practical path toward safer, more reliable multimodal reasoning in high-stakes applications.

Abstract

Multimodal large reasoning models (MLRMs) are rapidly advancing vision-language reasoning and are emerging as a foundation for cross-modal intelligence. Hallucination remains a persistent failure mode, manifesting as erroneous reasoning chains and misinterpretation of visual content. In this study, we observe that attention heads exhibit a staged division: shallow heads predominantly serve perception, while deeper heads shift toward symbolic reasoning. This division reveals two major causes of hallucination, namely perceptual bias and reasoning drift. To address these issues, we propose a lightweight and interpretable two-step plugin, Functional Head Identification and Class-conditioned Rescaling, which locates perception- and reasoning-oriented heads and regulates their contributions without retraining. Evaluations on three real-world MLRMs (Kimi-VL, Ocean-R1, R1-Onevision), six benchmarks across three domains, and four baselines show that our plugin achieves an average improvement of 5% and a maximum of 15%, with less than 1% additional computation and 9% of baseline latency. Our approach is completely model-agnostic and significantly enhances both the reliability and interpretability of off-the-shelf MLRMs, thereby enabling their safe deployment in high-stakes applications. Our code is available at https://anonymous.4open.science/r/Functional-Attention-Control.
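
The two steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not state the selection rule, so the sketch assumes that a head's Visual Attention Ratio is the share of its attention mass on image tokens, that perception-class heads are those at or above a threshold, and that reasoning-class heads are those at or below one. The function names, the `head_class` argument, and the direction of both comparisons are hypothetical. The default gains and thresholds are the values quoted in the figure captions for Ocean-R1.

```python
import numpy as np


def visual_attention_ratio(attn, image_mask):
    """Per-head share of attention mass placed on image tokens.

    attn: (heads, query_len, key_len) post-softmax attention weights.
    image_mask: (key_len,) boolean array, True at image-token positions.
    Assumption: the ratio is averaged over query positions.
    """
    mass_on_image = attn[:, :, image_mask].sum(axis=-1)  # (heads, query_len)
    return mass_on_image.mean(axis=-1)                   # (heads,)


def class_conditioned_gains(ratio, head_class,
                            tau_perc=0.22, tau_reas=0.01,
                            g_perc=1.16, g_reas=1.30):
    """Return one multiplicative gain per head (1.0 means unchanged).

    Assumption (hypothetical rule): in a perception-class layer, heads with
    ratio >= tau_perc are amplified; in a reasoning-class layer, heads with
    ratio <= tau_reas are amplified.
    """
    gains = np.ones_like(ratio, dtype=float)
    if head_class == "perception":
        gains[ratio >= tau_perc] = g_perc
    elif head_class == "reasoning":
        gains[ratio <= tau_reas] = g_reas
    else:
        raise ValueError(f"unknown head_class: {head_class!r}")
    return gains


def rescale_heads(head_outputs, gains):
    """Scale each head's output; head_outputs: (heads, query_len, d_head)."""
    return head_outputs * gains[:, None, None]


# Toy layer: 2 heads, 1 query position, 4 key positions (first 2 are image tokens).
attn = np.array([[[0.4, 0.4, 0.1, 0.1]],
                 [[0.0, 0.0, 0.5, 0.5]]])
image_mask = np.array([True, True, False, False])
ratio = visual_attention_ratio(attn, image_mask)
perc_gains = class_conditioned_gains(ratio, "perception")
reas_gains = class_conditioned_gains(ratio, "reasoning")
```

In this toy layer the first head attends mostly to image tokens and the second attends only to text, so the assumed rule amplifies the first head in a perception-class layer and the second in a reasoning-class layer. Because only per-head scalars change, no weights are retrained, which matches the plug-and-play framing of the abstract.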

Paper Structure

This paper contains 49 sections, 47 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Two hallucination examples corresponding to perception layers (Cause I) and reasoning layers (Cause II). The figure highlights attention patterns over text and image tokens, and the contribution of tokens at the first position where hallucination emerges.
  • Figure 2: Overall architecture. The attention module is edited by computing the Visual Attention Ratio, which determines the rescaling of specific heads. This process promotes more effective heads to become dominating heads, guiding the output toward correct perception and reasoning. The softmax function is not explicitly shown in the figure.
  • Figure 3: Efficiency comparison. Our method achieves the best efficiency while simultaneously improving accuracy. The x-axis reports $({\text{Acc}}/{\text{AvgBatchTime}})^2$ computed over 200 samples from HallusionBench with Kimi-VL.
  • Figure 4: Boundary sweep on Ocean-R1. We fix four hyperparameters ($g_{\mathrm{reas}}{=}1.3$, $\tau_{\mathrm{reas}}{=}0.01$, $g_{\mathrm{perc}}{=}1.16$, $\tau_{\mathrm{perc}}{=}0.22$) and vary the layer boundaries to assess performance sensitivity.
  • Figure 5: Analysis of multiplicative gains and ratio thresholds on Ocean-R1 with fixed boundaries $\ell_{\mathrm{reas}}{=}3$, $\ell_{\mathrm{perc}}{=}7$. Left: performance on three datasets under varying multiplicative gains $(g_{\mathrm{reas}}, g_{\mathrm{perc}})$. Middle: impact of ratio thresholds $(\tau_{\mathrm{reas}}, \tau_{\mathrm{perc}})$ on performance. Right: number of identified heads as a function of $(\tau_{\mathrm{reas}}, \tau_{\mathrm{perc}})$ (small horizontal jitters are added for visual clarity). Default settings are $g_{\mathrm{reas}}{=}1.30$, $\tau_{\mathrm{reas}}{=}0.01$, $g_{\mathrm{perc}}{=}1.16$, $\tau_{\mathrm{perc}}{=}0.22$.
  • ...and 4 more figures
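
The efficiency score on the x-axis of Figure 3, $({\text{Acc}}/{\text{AvgBatchTime}})^2$, can be computed as in the short sketch below. The caption does not give units, so the sketch assumes accuracy is a fraction in $[0, 1]$ and average batch time is in seconds. The function name and the sample numbers are hypothetical and are not results from the paper.

```python
def efficiency_score(accuracy, avg_batch_time):
    """Squared accuracy-per-time ratio, as in the Figure 3 axis label.

    Assumptions: accuracy is a fraction in [0, 1]; avg_batch_time is a
    positive duration in seconds.
    """
    if avg_batch_time <= 0:
        raise ValueError("avg_batch_time must be positive")
    return (accuracy / avg_batch_time) ** 2


# Hypothetical numbers for illustration only.
baseline = efficiency_score(0.60, 2.0)
slower = efficiency_score(0.60, 4.0)
```

Because the ratio is squared, doubling the batch time at the same accuracy cuts the score to a quarter, so the metric penalizes slow methods more strongly than a plain accuracy-per-second ratio would.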