TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling
Hyunmin Cho, Donghoon Ahn, Susung Hong, Jee Eun Kim, Seungryong Kim, Kyong Hwan Jin
TL;DR
This work tackles hallucinations in diffusion-based image synthesis by reframing inference-time guidance as geometry-aware trajectory refinement. It introduces Tangential Amplifying Guidance (TAG), which decomposes the base sampling update into normal and tangential components relative to the current latent state and amplifies the tangential part by a factor $\eta \ge 1$, while preserving the radial (noise-schedule) term. Grounded in Tweedie’s identity and a first-order Taylor analysis, TAG provably increases the local log-likelihood gain and steers samples toward higher-density regions of the data manifold without retraining. Empirically, TAG improves FID, IS, and CLIP-based metrics in both unconditional and conditional generation, on multiple backbones (including SD v1.5, v2.1, XL, and SD3) and even on flow-matching models, while reducing compute by requiring fewer function evaluations (NFEs). The method is plug-and-play and architecture-agnostic, offering a practical, low-overhead path to more faithful, hallucination-resistant diffusion sampling.
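The decomposition described above can be written out explicitly. The following is a sketch in notation assumed for illustration only ($x$ for the current latent state, $\Delta$ for the base sampling update, $P$ and $P^{\perp}$ for the projectors); the paper's own symbols may differ.

$$
\hat{x} = \frac{x}{\lVert x \rVert}, \qquad
P = \hat{x}\hat{x}^{\top}, \qquad
P^{\perp} = I - P,
$$

$$
\Delta = \underbrace{P\,\Delta}_{\text{normal (radial)}} + \underbrace{P^{\perp}\Delta}_{\text{tangential}}, \qquad
\Delta_{\mathrm{TAG}} = P\,\Delta + \eta\, P^{\perp}\Delta, \quad \eta \ge 1 .
$$

Setting $\eta = 1$ recovers the base update, and for any $\eta$ the radial part satisfies $\hat{x}^{\top}\Delta_{\mathrm{TAG}} = \hat{x}^{\top}\Delta$, since $\hat{x}^{\top}P^{\perp} = 0$.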
Abstract
Recent diffusion models achieve state-of-the-art performance in image generation, but often suffer from semantic inconsistencies or hallucinations. While various inference-time guidance methods can enhance generation, they often operate indirectly by relying on external signals or architectural modifications, which introduces additional computational overhead. In this paper, we propose Tangential Amplifying Guidance (TAG), a more efficient and direct guidance method that operates solely on trajectory signals without modifying the underlying diffusion model. TAG leverages an intermediate sample as a projection basis and amplifies the tangential components of the estimated scores with respect to this basis to correct the sampling trajectory. We formalize this guidance process by leveraging a first-order Taylor expansion, which demonstrates that amplifying the tangential component steers the state toward higher-probability regions, thereby reducing inconsistencies and enhancing sample quality. TAG is a plug-and-play, architecture-agnostic module that improves diffusion sampling fidelity with minimal computational overhead, offering a new perspective on diffusion guidance.
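To make the projection step concrete, here is a minimal NumPy sketch of tangential amplification as the abstract describes it. It is not the authors' implementation. The function name `tangential_amplify`, the argument names, and the value of `eta` are all assumptions for illustration, and the intermediate sample is treated as a single flattened vector.

```python
import numpy as np


def tangential_amplify(x, update, eta=1.15, eps=1e-12):
    """Amplify the part of `update` that is orthogonal to `x`.

    Hypothetical helper (names and defaults are illustrative assumptions):
      x      -- intermediate sample used as the projection basis
      update -- base sampling update (or estimated score), same shape as x
      eta    -- tangential amplification factor, expected to be >= 1
    """
    x_flat = np.asarray(x, dtype=np.float64).reshape(-1)
    u_flat = np.asarray(update, dtype=np.float64).reshape(-1)

    # Unit vector along the current sample; eps guards against a zero norm.
    unit = x_flat / (np.linalg.norm(x_flat) + eps)

    # Normal (radial) component: projection of the update onto the sample direction.
    normal = np.dot(u_flat, unit) * unit
    # Tangential component: whatever remains after removing the normal part.
    tangential = u_flat - normal

    # Keep the normal part as is and scale only the tangential part.
    return (normal + eta * tangential).reshape(np.shape(update))


# Small usage example with made-up numbers.
x = np.array([1.0, 0.0])
update = np.array([2.0, 3.0])
guided = tangential_amplify(x, update, eta=2.0)
```

In this sketch the component of the update along `x` is left unchanged, so only the direction of motion orthogonal to the current sample is strengthened. With `eta=1.0` the function returns the base update.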
