Dual-Solver: A Generalized ODE Solver for Diffusion Models with Dual Prediction

Soochul Park; Yeon Ju Lee

TL;DR

Dual-Solver generalizes multistep samplers through learnable parameters that continuously interpolate among prediction types, select the integration domain, and adjust the residual terms, improving FID and CLIP scores in the low-NFE regime.

Abstract

Diffusion models achieve state-of-the-art image quality. However, sampling is costly at inference time because it requires a large number of function evaluations (NFEs). To reduce NFEs, classical ODE numerical methods have been adopted. Yet the choice of prediction type and integration domain leads to different sampling behaviors. To address this sensitivity, we introduce Dual-Solver, which generalizes multistep samplers through learnable parameters that continuously (i) interpolate among prediction types, (ii) select the integration domain, and (iii) adjust the residual terms. It retains the standard predictor-corrector structure while preserving second-order local accuracy. These parameters are learned via a classification-based objective using a frozen pretrained classifier (e.g., MobileNet or CLIP). For ImageNet class-conditional generation (DiT, GM-DiT) and text-to-image generation (SANA, PixArt-$\alpha$), Dual-Solver improves FID and CLIP scores in the low-NFE regime ($3 \le \text{NFE} \le 9$) across backbones.
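As a rough illustration of what "prediction type" means in the abstract, the sketch below converts between noise and data predictions and takes one first-order deterministic step. It assumes the common forward process $x_t = \alpha_t x_0 + \sigma_t \epsilon$; the function names are hypothetical and are not taken from the paper. It does not implement Dual-Solver's learned parameters, and velocity prediction is left out because its definition depends on the schedule convention.

```python
import numpy as np

# Assumption (not from the paper): x_t = alpha_t * x0 + sigma_t * eps.
# All function names here are illustrative only.

def noise_to_data(x_t, eps_hat, alpha_t, sigma_t):
    """Data prediction implied by a noise prediction."""
    return (x_t - sigma_t * eps_hat) / alpha_t

def data_to_noise(x_t, x0_hat, alpha_t, sigma_t):
    """Noise prediction implied by a data prediction."""
    return (x_t - alpha_t * x0_hat) / sigma_t

def first_order_step(x_t, eps_hat, alpha_t, sigma_t, alpha_s, sigma_s):
    """One DDIM-style deterministic step from time t to time s.

    The two predictions are held fixed over the step and recombined
    with the target-time coefficients.
    """
    x0_hat = noise_to_data(x_t, eps_hat, alpha_t, sigma_t)
    return alpha_s * x0_hat + sigma_s * eps_hat
```

With an exact noise prediction, the step lands on $\alpha_s x_0 + \sigma_s \epsilon$. With an imperfect network, the result of a multistep update depends on which quantity is extrapolated across steps, which is the kind of sensitivity the abstract describes.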

Paper Structure (65 sections, 3 theorems, 54 equations, 20 figures, 11 tables, 2 algorithms)


Introduction
Preliminaries
Diffusion Models
Training process.
Sampling process.
Prediction Types
Discretization discrepancy.
Dual-Solver
Dual Prediction with Parameter $\gamma$
Integral form of dual prediction.
Log-Linear Domain Change with Parameter $\tau$
Domain change.
Log-linear transform.
Approximation with Parameter $\kappa$
Implementation Details
...and 50 more sections

Key Result

Theorem C.1

Assume that ${\bm{x}}_\theta(u)$ and $\bm\epsilon_\theta(v)$ are $C^1$ on $[u_i,u_{i+1}]$ and $[v_i,v_{i+1}]$, respectively. Let ${\bm{x}}^{\text{exact}}_{t_{i+1}}$ denote the exact update in Eq. \eqref{eq:invertible_transform_integral}, and let ${\bm{x}}^{\text{1st-pred.}}_{t_{i+1}}$ denote the first–

Figures (20)

Figure 1: Sampling results. SANA [xie2024sana], NFE=3, CFG=4.5. See Fig. \ref{fig:additional_sana} for further results.
Figure 2: Euler updates for noise, velocity, and data predictions.
Figure 3: Learned parameters. Values of $\{\gamma,\tau_u,\tau_v,\kappa_u,\kappa_v\}$ across sampling steps, learned on DiT [peebles2023scalable] at NFE=5. See Figs. \ref{fig:learned_dit}, \ref{fig:learned_gmdit}, \ref{fig:learned_sana}, and \ref{fig:learned_pixart} for further results.
Figure 4: Solver parameter learning methods. Schematic illustration of trajectory, sample, and feature regression, as well as soft- and hard-label classification.
Figure 5: Main quantitative results. FID and CLIP score; evaluated on 50k (DiT/GM-DiT) and 30k (SANA/PixArt-$\alpha$) samples; CFG: DiT=1.5, GM-DiT=1.4, SANA=4.5, PixArt-$\alpha$=3.5.
...and 15 more figures

Theorems & Definitions (8)

Theorem C.1
proof
Theorem C.2
proof
Theorem C.3
proof
Definition G.1: Linear interpolation
Definition G.2: Averaged linear interpolation
