Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models

Zhengtao Zou, Ya Gao, Jiarui Guan, Bin Li, Pekka Marttinen

TL;DR

Extensive experiments indicate that RUDDER achieves performance comparable to state-of-the-art methods while introducing negligible computational latency, validating RUDDER as a pragmatic and effective approach for improving LVLMs' reliability without a significant compromise on efficiency.

Abstract

Large Vision-Language Models (LVLMs) often suffer from object hallucination, generating text inconsistent with visual inputs, which can critically undermine their reliability. Existing inference-time interventions to mitigate this issue present a challenging trade-off: while methods that steer internal states or adjust output logits can be effective, they often incur substantial computational overhead, typically requiring extra forward passes. This efficiency bottleneck can limit their practicality for real-world, latency-sensitive deployments. In this work, we aim to address this trade-off with Residual-Update Directed DEcoding Regulation (RUDDER), a low-overhead framework that steers LVLMs towards visually-grounded generation. RUDDER is built on two key innovations: (1) the Contextual Activation Residual Direction (CARD) vector, a per-sample visual evidence vector extracted from the residual update of a self-attention layer during a single, standard forward pass; and (2) a Bayesian-inspired adaptive gate that performs token-wise injection, applying a corrective signal whose strength is conditioned on the model's deviation from the visual context. Extensive experiments on key hallucination benchmarks, including POPE and CHAIR, indicate that RUDDER achieves performance comparable to state-of-the-art methods while introducing negligible computational latency, validating RUDDER as a pragmatic and effective approach for improving LVLMs' reliability without a significant compromise on efficiency.

Paper Structure

This paper contains 40 sections, 11 equations, 12 figures, 4 tables, 1 algorithm.

Figures (12)

  • Figure 1: (Left) An example where the vanilla LLaVA-1.5-7B [liu_improved_2024] hallucinates objects. Erroneous text is marked in red, while RUDDER's corrected, factual output is in blue. (Right) A comparison showing that unlike existing non-steering and steering-based methods, RUDDER provides adaptive, low-overhead control without requiring extra forward passes.
  • Figure 2: The overall workflow of RUDDER. Our method operates in two stages. (1) Prefill Stage (Yellow Arrows): We extract the CARD vector $\mathbf{v}_{\mathrm{CARD}}$ by first collecting the attention-induced residual updates $\Delta_i^{l}$ from a target layer $l$ for each token $i$ in the prefill span. These updates are then aggregated using pooling and normalization. The final CARD vector is cached for each (image, prompt) pair. (2) Decoding Stage (Orange Arrows): When generating each answer token $t$, the adaptive Beta Gate computes a steering vector $\mathbf{v}^{\text{steer}}_t$, which is then injected into the residual stream to guide the LVLM towards a more visually-grounded output.
  • Figure 3: Ablation study of RUDDER's hyperparameters on the Idefics2 [laurenccon2024matters] model. (a) The bar plot shows the impact of the intervention layer L. (b-d) The heatmaps analyze the trade-off between steering strength $\alpha_\text{max}$ and gate sensitivity $k$, showing their effect on CHAIR scores and recall.
  • Figure 4: Structure in steering space (b) and its sample-wise projection to $\mathbf{v}_{\text{CARD}}$ (a).
  • Figure 5: Directional evidence with reflowed layout. (a) Consistent $\mathbf{v}_{\text{text}}\!\rightarrow\!\mathbf{v}_{\text{img+txt}}$ rotation; (b) positive alignment gain to $\mathbf{v}^{\text{steer}}$; (c,d) cluster-wise stability; (e) systematic gate differences.
  • ...and 7 more figures
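
The two-stage workflow in the Figure 2 caption can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. Mean pooling, L2 normalization, and the function names `extract_card`, `gate_strength`, and `steer` are assumptions made for illustration. In particular, the logistic gate below is a simple stand-in for the paper's Beta Gate, whose exact form is not given in this summary; it only reproduces the stated behaviour that the steering strength grows as the hidden state deviates from the visual direction, bounded by $\alpha_\text{max}$ and sharpened by $k$.

```python
import math
from typing import List, Sequence

Vector = List[float]


def l2_norm(v: Sequence[float]) -> float:
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def extract_card(residual_updates: Sequence[Sequence[float]],
                 eps: float = 1e-8) -> Vector:
    """Prefill stage: pool per-token residual updates into one unit vector.

    residual_updates holds one attention-induced update per prefill token.
    Assumption: mean pooling followed by L2 normalization.
    """
    n = len(residual_updates)
    dim = len(residual_updates[0])
    pooled = [sum(row[j] for row in residual_updates) / n for j in range(dim)]
    norm = l2_norm(pooled) + eps
    return [x / norm for x in pooled]


def gate_strength(h_t: Sequence[float], v_card: Sequence[float],
                  alpha_max: float = 1.0, k: float = 5.0,
                  eps: float = 1e-8) -> float:
    """Hypothetical stand-in gate (NOT the paper's Beta Gate).

    Returns a value in (0, alpha_max) that is small when h_t is aligned
    with v_card and large when h_t points away from it.
    """
    dot = sum(a * b for a, b in zip(h_t, v_card))
    cos = dot / (l2_norm(h_t) * l2_norm(v_card) + eps)
    return alpha_max / (1.0 + math.exp(k * cos))


def steer(h_t: Sequence[float], v_card: Sequence[float],
          alpha_max: float = 1.0, k: float = 5.0) -> Vector:
    """Decoding stage: add the gated steering vector to the hidden state."""
    strength = gate_strength(h_t, v_card, alpha_max, k)
    return [h + strength * v for h, v in zip(h_t, v_card)]


# Toy usage: two prefill tokens, three hidden dimensions.
updates = [[0.2, 0.0, 0.1], [0.4, 0.1, -0.1]]
v_card = extract_card(updates)              # computed once, then cached
h_steered = steer([-1.0, 0.5, 0.0], v_card)  # applied at each decoding step
```

In this sketch the CARD vector is computed once from the prefill pass and reused for every generated token, which is where the claimed low overhead would come from: each decoding step adds only a dot product and a vector addition.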