Error as Signal: Stiffness-Aware Diffusion Sampling via Embedded Runge-Kutta Guidance

Inho Kong, Sojin Lee, Youngjoon Hong, Hyunwoo J. Kim

TL;DR

The paper proposes Embedded Runge-Kutta Guidance (ERK-Guid), which exploits detected stiffness to reduce local truncation error (LTE) and stabilize sampling, and shows that ERK-Guid consistently outperforms state-of-the-art methods.

Abstract

Classifier-Free Guidance (CFG) has established the foundation for guidance mechanisms in diffusion models, showing that well-designed guidance proxies significantly improve conditional generation and sample quality. Autoguidance (AG) has extended this idea, but it relies on an auxiliary network and leaves solver-induced errors unaddressed. In stiff regions, where the ODE trajectory changes sharply, local truncation error (LTE) becomes a critical factor that degrades sample quality. Our key observation is that these errors align with the dominant eigenvector, motivating us to leverage the solver-induced error as a guidance signal. We propose Embedded Runge-Kutta Guidance (ERK-Guid), which exploits detected stiffness to reduce LTE and stabilize sampling. We theoretically and empirically analyze stiffness and eigenvector estimators with solver errors to motivate the design of ERK-Guid. Our experiments on both synthetic datasets and the popular benchmark dataset, ImageNet, demonstrate that ERK-Guid consistently outperforms state-of-the-art methods. Code is available at https://github.com/mlvlab/ERK-Guid.
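
The observation that solver error aligns with the dominant eigenvector can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a hypothetical 2D linear drift with eigenvalues -50 and -1, takes one embedded Heun/Euler step, and measures how well the difference between the two solutions (the "ERK solution difference") and the true LTE of the Heun step align with the dominant eigenvector. All names (`drift`, `heun_euler_pair`, `abs_cosine`) and constants are illustrative assumptions.

```python
import math

# Hypothetical toy setup: eigenvectors of a symmetric 2x2 matrix A.
THETA = 0.3
V1 = (math.cos(THETA), math.sin(THETA))    # dominant eigenvector, eigenvalue -50
V2 = (-math.sin(THETA), math.cos(THETA))   # slow eigenvector, eigenvalue -1
LAMBDAS = (-50.0, -1.0)

def coords(x):
    """Coordinates of x in the eigenbasis (V1, V2)."""
    return (V1[0] * x[0] + V1[1] * x[1], V2[0] * x[0] + V2[1] * x[1])

def from_coords(c1, c2):
    """Map eigenbasis coordinates back to the standard basis."""
    return [c1 * V1[i] + c2 * V2[i] for i in range(2)]

def drift(x):
    """Toy linear drift f(x) = A x with A = -50 v1 v1^T - 1 v2 v2^T."""
    c1, c2 = coords(x)
    return from_coords(LAMBDAS[0] * c1, LAMBDAS[1] * c2)

def exact_flow(x, h):
    """Exact solution of dx/dt = A x after time h (available only for this toy)."""
    c1, c2 = coords(x)
    return from_coords(math.exp(LAMBDAS[0] * h) * c1, math.exp(LAMBDAS[1] * h) * c2)

def heun_euler_pair(f, x, h):
    """One embedded step: returns the 2nd-order Heun and 1st-order Euler solutions."""
    k1 = f(x)
    x_euler = [xi + h * ki for xi, ki in zip(x, k1)]
    k2 = f(x_euler)
    x_heun = [xi + 0.5 * h * (a + b) for xi, a, b in zip(x, k1, k2)]
    return x_heun, x_euler

def abs_cosine(u, v):
    """Absolute cosine similarity between two 2D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return abs(dot) / (math.hypot(*u) * math.hypot(*v))

x0, h = [1.0, 1.0], 0.01
x_heun, x_euler = heun_euler_pair(drift, x0, h)
erk_diff = [a - b for a, b in zip(x_heun, x_euler)]       # embedded error estimate
lte = [a - b for a, b in zip(exact_flow(x0, h), x_heun)]  # true local error of Heun

align_erk = abs_cosine(erk_diff, V1)
align_lte = abs_cosine(lte, V1)
print(f"ERK difference vs dominant eigenvector: {align_erk:.6f}")
print(f"Heun LTE vs dominant eigenvector:       {align_lte:.6f}")
```

In this toy case both quantities point almost exactly along the dominant eigenvector, because each application of the drift amplifies the stiff component by a factor of 50 relative to the slow one. A real diffusion drift is nonlinear and high-dimensional, so the alignment there is an empirical matter.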

Paper Structure (12 sections, 1 theorem, 18 equations, 5 figures, 4 tables)

Key Result

Proposition 1

Let $J_{\sigma_i}$ be the Jacobian matrix of the drift function $\bm{f}(\mathbf{x}_{\sigma}; \sigma)$ evaluated at $\mathbf{x}_{\sigma_i}^{\mathrm{Heun}}$. Assume that $\bm{f}(\mathbf{x}_{\sigma}; \sigma)$ has a locally Lipschitz Jacobian near $\mathbf{x}_{\sigma_i}^{\mathrm{Heun}}$. If the ERK solu then the magnitude of the dominant eigenvalue, $|\lambda|$, admits the approximation

Figures (5)

  • Figure 1: Projection of local truncation error (LTE) and ERK solution difference onto eigenvector axes.
  • Figure 2: Eigenvector alignment across stiffness. (a) Cosine similarity between the dominant eigenvector and the local truncation error (LTE) increases with stiffness. (b) The ERK solution difference exhibits a similar trend to LTE, suggesting it can serve as a reliable proxy for the LTE direction in high stiffness regions. (c) Our ERK-Guid consistently achieves higher cosine similarity with the eigenvector, highlighting its strong alignment in stiff regions. CFG and Autoguidance exhibit weaker or mixed alignment with the dominant eigenvector in stiff regions, supporting the complementarity of our method.
  • Figure 3: Accuracy of proposed estimators. (a) Our estimated stiffness values highly correlate with the JVP-based one. (b) ERK drift difference (blue) maintains higher alignment with the dominant eigenvector than the ERK solution difference (orange), especially at high stiffness.
  • Figure 4: Grid search of hyperparameters. Quantitative trends of FID and FD-DINOv2 as $w_{\mathrm{con}}$ varies. Each curve corresponds to a different value of $w_{\mathrm{stiff}}$, with points indicating increasing $w_{\mathrm{con}}$, shown for (a) 16-step and (b) 32-step sampling.
  • Figure 5: Qualitative comparison on PixArt-$\alpha$ (Chen et al., 2023). We perform text-to-image generation to compare DPM-Solver with our ERK-Guid. As shown in the blue zoomed-in regions, ERK-Guid captures fine semantic details more accurately.
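
The comparison described in the Figure 3 caption (an ERK-based stiffness estimate against a JVP-based reference, and drift difference against solution difference as eigenvector proxies) can be mimicked on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's estimator: the stiffness estimate used here is the generic ratio of the drift-difference norm to the solution-difference norm, a common stiffness heuristic, and is not taken from Proposition 1. The drift, the finite-difference JVP, and all function names are hypothetical.

```python
import math

def f(x):
    """Hypothetical stiff linear drift with eigenvalues -50 (axis 0) and -1 (axis 1)."""
    return [-50.0 * x[0], -1.0 * x[1]]

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def abs_cosine(u, v):
    return abs(sum(a * b for a, b in zip(u, v))) / (norm(u) * norm(v))

def jvp_fd(fn, x, v, eps=1e-6):
    """Finite-difference stand-in for a Jacobian-vector product J(x) v."""
    x_shift = [xi + eps * vi for xi, vi in zip(x, v)]
    return [d / eps for d in sub(fn(x_shift), fn(x))]

def dominant_eig_power(fn, x, iters=30):
    """Reference |lambda| and eigenvector via power iteration on JVPs."""
    v = [1.0 / math.sqrt(len(x))] * len(x)
    for _ in range(iters):
        jv = jvp_fd(fn, x, v)
        n = norm(jv)
        v = [a / n for a in jv]
    return norm(jvp_fd(fn, x, v)), v

def erk_estimates(fn, x, h):
    """Embedded Heun/Euler step, returning a stiffness estimate and both differences."""
    k1 = fn(x)
    x_euler = [xi + h * ki for xi, ki in zip(x, k1)]
    k2 = fn(x_euler)
    x_heun = [xi + 0.5 * h * (a + b) for xi, a, b in zip(x, k1, k2)]
    sol_diff = sub(x_heun, x_euler)      # ERK solution difference
    drift_diff = sub(fn(x_heun), k2)     # ERK drift difference
    stiffness = norm(drift_diff) / norm(sol_diff)  # assumed heuristic estimator
    return stiffness, sol_diff, drift_diff

# Start mostly along the slow direction so the two proxies are distinguishable.
x0 = [0.002, 1.0]
lam_ref, v_ref = dominant_eig_power(f, x0)
lam_erk, sol_diff, drift_diff = erk_estimates(f, x0, h=0.01)
cos_sol = abs_cosine(sol_diff, v_ref)
cos_drift = abs_cosine(drift_diff, v_ref)

print(f"reference |lambda| (power iteration): {lam_ref:.3f}")
print(f"ERK-based stiffness estimate:         {lam_erk:.3f}")
print(f"alignment of solution difference:     {cos_sol:.5f}")
print(f"alignment of drift difference:        {cos_drift:.5f}")
```

In this toy setting the drift difference is better aligned than the solution difference because it has been multiplied by the Jacobian one more time, which further amplifies the stiff component. Whether the same ordering holds for a learned diffusion drift is what the figure reports empirically.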

Theorems & Definitions (1)

  • Proposition 1