S4S: Solving for a Diffusion Model Solver
Eric Frankel, Sitan Chen, Jerry Li, Pang Wei Koh, Lillian J. Ratliff, Sewoong Oh
TL;DR
This work tackles the inefficiency of diffusion-model sampling when only a few neural function evaluations (NFEs) are allowed. It introduces S4S, a lightweight method that learns solver coefficients by matching the output of a high-NFE teacher solver, improving sample quality over traditional ODE solvers in every setting tested without requiring data or modifying the underlying model. Building on this, S4S-Alt jointly optimizes the solver coefficients and the discretization steps via alternating minimization, achieving roughly a 1.5× improvement in FID over previous training-free ODE methods with as few as 5 NFEs. The approach is black-box compatible with any discretization schedule or architecture, offering a practical path to faster, high-quality diffusion sampling, with potential future extensions to SDEs and sample-level adaptation.
Abstract
Diffusion models (DMs) create samples from a data distribution by starting from random noise and iteratively solving a reverse-time ordinary differential equation (ODE). Because each step in the iterative solution requires an expensive neural function evaluation (NFE), there has been significant interest in approximately solving these diffusion ODEs with only a few NFEs without modifying the underlying model. However, in the few-NFE regime, we observe that tracking the true ODE evolution is fundamentally impossible using traditional ODE solvers. In this work, we propose a new method that learns a good solver for the DM, which we call Solving for the Solver (S4S). S4S directly optimizes a solver to obtain good generation quality by learning to match the output of a strong teacher solver. We evaluate S4S on six different pre-trained DMs, including pixel-space and latent-space DMs for both conditional and unconditional sampling. In all settings, S4S uniformly improves the sample quality relative to traditional ODE solvers. Moreover, our method is lightweight, data-free, and can be plugged in black-box on top of any discretization schedule or architecture to improve performance. Building on top of this, we also propose S4S-Alt, which optimizes both the solver and the discretization schedule. By exploiting the full design space of DM solvers, with 5 NFEs, we achieve an FID of 3.73 on CIFAR10 and 13.26 on MS-COCO, representing a $1.5\times$ improvement over previous training-free ODE methods.
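
The idea of learning a solver by matching a teacher's output can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the parameterization or objective used by S4S: the "model" is the scalar linear ODE dx/dt = -x, the teacher is a 1000-step Euler solver, the student is a 5-step Euler solver with one learned scalar coefficient per step, and the coefficients are fit by gradient descent with finite-difference gradients. The function names (`teacher`, `student`, `fit`) are hypothetical. Only noise samples are used, in keeping with the data-free setting.

```python
import numpy as np

def f(x, t):
    # Toy stand-in for the diffusion ODE's velocity field: dx/dt = -x.
    return -x

def teacher(x0, T=1.0, steps=1000):
    # High-NFE teacher: many small Euler steps.
    x, h = x0.copy(), T / steps
    for i in range(steps):
        x = x + h * f(x, i * h)
    return x

def student(x0, coeffs, T=1.0):
    # Few-NFE student: one function evaluation per step,
    # with a learned coefficient scaling each Euler update.
    x, h = x0.copy(), T / len(coeffs)
    for i, c in enumerate(coeffs):
        x = x + h * c * f(x, i * h)
    return x

def loss(coeffs, x0, target):
    # Mean squared distance between student and teacher outputs.
    return float(np.mean((student(x0, coeffs) - target) ** 2))

def fit(x0, target, n_steps=5, lr=0.5, iters=500, eps=1e-5):
    coeffs = np.ones(n_steps)  # initialize at plain Euler
    for _ in range(iters):
        grad = np.zeros_like(coeffs)
        for j in range(n_steps):
            d = np.zeros_like(coeffs)
            d[j] = eps
            # Central finite difference in coordinate j.
            grad[j] = (loss(coeffs + d, x0, target)
                       - loss(coeffs - d, x0, target)) / (2 * eps)
        coeffs = coeffs - lr * grad
    return coeffs

rng = np.random.default_rng(0)
x0 = rng.standard_normal(256)          # noise samples only; no data needed
target = teacher(x0)                   # teacher outputs to match
baseline = loss(np.ones(5), x0, target)  # plain 5-step Euler
coeffs = fit(x0, target)
learned = loss(coeffs, x0, target)     # 5-step solver with learned coefficients
print(f"baseline={baseline:.2e}  learned={learned:.2e}")
```

In this toy setting the learned coefficients shrink each Euler step slightly, so the 5-step student lands much closer to the teacher than plain Euler at the same NFE budget. A real diffusion model would replace `f` with the network and the finite-difference gradients with backpropagation.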
