Characteristic Guidance: Non-linear Correction for Diffusion Model at Large Guidance Scale
Candi Zheng, Yuan Lan
TL;DR
This work addresses non-linear deviations that arise in diffusion model guidance at large guidance scales by introducing characteristic guidance (CH), a training-free correction grounded in the method of characteristics that keeps the guided dynamics consistent with the Fokker-Planck (FP) equation. CH constructs a nonlinear denoising function from two base networks with a nonlinear input perturbation and a fixed-point correction, leveraging a harmonic ansatz to enable analytic relationships and zero mixing error in the infinitesimal-step limit. The authors validate CH theoretically on Gaussian toy models and empirically across magnet phase transitions, CIFAR-10, ImageNet-256, and Stable Diffusion, showing improved semantic control, reduced color/exposure artifacts, and better sampling diversity, often with little or no loss in standard quality metrics. The approach is compatible with a wide range of samplers and data types, suggesting a practical path to robust high-guidance diffusion that preserves detail and alignment with conditional prompts in real-world applications.
Abstract
Popular guidance for denoising diffusion probabilistic models (DDPMs) linearly combines distinct conditional models to provide enhanced control over samples. However, this approach overlooks nonlinear effects that become significant when the guidance scale is large. To address this issue, we propose characteristic guidance, a guidance method that provides a first-principles non-linear correction for classifier-free guidance. This correction forces the guided DDPMs to respect the Fokker-Planck (FP) equation of the diffusion process, in a way that is training-free and compatible with existing sampling methods. Experiments show that characteristic guidance enhances the semantic characteristics of prompts and mitigates irregularities in image generation, proving effective in diverse applications ranging from simulating magnet phase transitions to latent space sampling.
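For orientation, the linear combination that the abstract refers to can be sketched as follows. This is a minimal illustration of standard classifier-free guidance in one common convention, (1 + w) * eps_cond - w * eps_uncond, and not the characteristic guidance correction itself. The function name cfg_combine and the use of plain lists for the noise predictions are assumptions made for this example; the convention for the guidance scale w varies between implementations.

```python
from typing import List, Sequence


def cfg_combine(eps_cond: Sequence[float],
                eps_uncond: Sequence[float],
                w: float) -> List[float]:
    """Linearly combine two noise predictions (hypothetical helper).

    Assumed convention: (1 + w) * eps_cond - w * eps_uncond,
    so w = 0 returns the conditional prediction unchanged.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


# Toy predictions, chosen only for illustration.
eps_c = [1.0, 2.0]
eps_u = [0.5, 1.0]

no_guidance = cfg_combine(eps_c, eps_u, w=0.0)
strong_guidance = cfg_combine(eps_c, eps_u, w=2.0)
```

With w = 0 the result equals the conditional prediction; as w grows, the output is extrapolated further along the difference between the two predictions. This is the large-scale regime in which, according to the abstract, the nonlinear effects ignored by the linear rule become significant.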
