Mitigating Hallucinations in Diffusion Models through Adaptive Attention Modulation
Trevine Oorloff, Yaser Yacoob, Abhinav Shrivastava
TL;DR
The paper tackles hallucinations in unconditional diffusion models with Adaptive Attention Modulation (AAM), which dynamically adjusts the sharpness of self-attention during denoising and adds masked perturbations to suppress anomalous regions that appear early. Using an inference-time optimization guided by a PatchCore anomaly signal and a memory bank built from the denoising UNet, the approach reduces hallucinations and improves FID across multiple datasets. Key contributions include identifying the critical role of attention in the early denoising steps, proposing a practical adaptive temperature framework, and validating the method with extensive ablations showing that each component contributes an additive gain. The technique offers a practical route to more faithful unconditional diffusion outputs and greater reliability in downstream applications.
Abstract
Diffusion models, while increasingly adept at generating realistic images, are notably hindered by hallucinations -- unrealistic or incorrect features inconsistent with the training data distribution. In this work, we propose Adaptive Attention Modulation (AAM), a novel approach to mitigate hallucinations by analyzing and modulating the self-attention mechanism in diffusion models. We hypothesize that self-attention during early denoising steps may inadvertently amplify or suppress features, contributing to hallucinations. To counter this, AAM introduces a temperature scaling mechanism within the softmax operation of the self-attention layers, dynamically modulating the attention distribution during inference. Additionally, AAM employs a masked perturbation technique to disrupt early-stage noise that may otherwise propagate into later stages as hallucinations. Extensive experiments demonstrate that AAM effectively reduces hallucinatory artifacts, enhancing both the fidelity and reliability of generated images. For instance, on the Hands dataset the proposed approach improves the FID score by 20.8% and reduces the share of hallucinated images by 12.9 percentage points.
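To illustrate the temperature scaling described in the abstract, here is a minimal sketch of a temperature-scaled softmax inside single-query dot-product attention. It is not the authors' implementation. The function names (`softmax_with_temperature`, `attend`), the toy vectors, and the choice to divide the scaled scores by the temperature are all assumptions made for illustration; the paper's exact convention and its procedure for choosing the temperature adaptively are not shown here.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Softmax over `scores` after dividing each score by `temperature`.

    Assumed convention: temperature < 1 sharpens the distribution,
    temperature > 1 flattens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values, temperature=1.0):
    """Single-query scaled dot-product attention with a softmax temperature."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax_with_temperature(scores, temperature)
    # Weighted sum of the value vectors, one output component at a time.
    output = [
        sum(w * value[j] for w, value in zip(weights, values))
        for j in range(len(values[0]))
    ]
    return output, weights


# Hypothetical toy inputs: one query attending over three key/value pairs.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

_, sharp_weights = attend(query, keys, values, temperature=0.5)
_, base_weights = attend(query, keys, values, temperature=1.0)
_, flat_weights = attend(query, keys, values, temperature=2.0)
```

In this sketch, lowering the temperature concentrates attention on the best-matching key, while raising it spreads attention more evenly across keys. An adaptive scheme such as the one the abstract describes would vary this value during inference rather than fix it in advance.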
