Elucidating the solution space of extended reverse-time SDE for diffusion models
Qinpeng Cui, Xinyi Zhang, Qiqi Bao, Qingmin Liao
TL;DR
This paper addresses the speed-quality trade-off in diffusion-model sampling by unifying ODE and SDE approaches under an Extended Reverse-Time SDE (ER SDE) framework. It exploits the semi-linear structure of ER SDE solutions to derive exact and approximate solutions for both VP SDE and VE SDE, and introduces the concept of one-step prediction errors to explain why ODE solvers excel in low-NFE regimes while SDE solvers excel as NFE grows. By exploring the ER SDE solution space through carefully chosen noise-scale functions $\phi(\cdot)$, the authors design ER-SDE-Solvers that sample quickly at high quality and report state-of-the-art performance among training-free stochastic samplers (e.g., 8.33 FID on ImageNet $128\times128$ with $NFE=20$). The results suggest a practical path to fast, high-fidelity diffusion-based generation, with classifier guidance further boosting efficiency at higher resolutions. The work connects ODE and SDE dynamics in a single framework and offers concrete guidance on choosing the noise-scale function for large-scale diffusion models.
Abstract
Sampling from diffusion models can be viewed as solving differential equations, which raises the challenge of balancing speed against visual quality. ODE-based samplers are fast but reach a performance limit, whereas SDE-based samplers achieve superior quality at the cost of more iterations. In this work, we formulate the sampling process as an Extended Reverse-Time SDE (ER SDE), unifying prior explorations into ODEs and SDEs. Theoretically, leveraging the semi-linear structure of ER SDE solutions, we derive exact solutions and approximate solutions for both VP SDE and VE SDE. By analyzing the errors of the approximate solutions in the ER SDE solution space, referred to as one-step prediction errors, we obtain mathematical insights that explain the rapid sampling capability of ODE solvers and the high-quality sampling ability of SDE solvers. Additionally, we show that VP SDE solvers stand on par with their VE SDE counterparts. Based on these findings, and combining the advantages of ODE solvers and SDE solvers, we devise efficient high-quality samplers, namely ER-SDE-Solvers. Experimental results demonstrate that ER-SDE-Solvers achieve state-of-the-art performance among stochastic samplers while maintaining the efficiency of deterministic samplers. Specifically, on the ImageNet $128\times128$ dataset, ER-SDE-Solvers obtain 8.33 FID in only 20 function evaluations. Code is available at https://github.com/QinpengCui/ER-SDE-Solver.
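To make the role of the noise-scale function $\phi(\cdot)$ concrete, the following is a minimal sketch of a first-order, VE-style update written from the semi-linear structure described above. It is an illustration under stated assumptions, not the authors' implementation: the function names (`er_sde_step_ve`, `sample`, `gaussian_denoiser`), the geometric noise schedule, and the 1-D Gaussian toy data with its closed-form denoiser are all hypothetical choices made here, and the exact update should be checked against the paper and the linked repository. In this sketch, $\phi(x)=x$ removes the injected noise (an ODE-like step), while $\phi(x)=x^2$ is assumed to correspond to the standard reverse-time SDE.

```python
import numpy as np

def er_sde_step_ve(x, sigma_s, sigma_t, denoise, phi, rng):
    """One first-order step from noise level sigma_s down to sigma_t < sigma_s.

    `denoise(x, sigma)` is assumed to be a data-prediction model (estimate of
    the clean sample); `phi` is the noise-scale function.
    """
    ratio = phi(sigma_t) / phi(sigma_s)
    x0_hat = denoise(x, sigma_s)
    # Variance of the noise injected in this step; zero when phi(x) = x.
    noise_var = sigma_t**2 - (sigma_s * ratio) ** 2
    noise_std = np.sqrt(max(noise_var, 0.0))
    return ratio * x + (1.0 - ratio) * x0_hat + noise_std * rng.standard_normal(x.shape)

def phi_ode(s):
    # No injected noise: deterministic, ODE-like update.
    return s

def phi_sde(s):
    # Assumed to match the usual reverse-time SDE noise level.
    return s**2

# Toy setting (hypothetical): data ~ N(MU, S0^2), so the ideal denoiser
# E[x0 | x_sigma] is available in closed form.
MU, S0 = 1.5, 0.5

def gaussian_denoiser(x, sigma):
    return MU + S0**2 / (S0**2 + sigma**2) * (x - MU)

def sample(denoise, phi, n_samples=20000, n_steps=20,
           sigma_max=80.0, sigma_min=0.002, seed=0):
    """Run n_steps updates (one denoiser call each) on a geometric schedule."""
    rng = np.random.default_rng(seed)
    sigmas = np.geomspace(sigma_max, sigma_min, n_steps + 1)
    x = sigma_max * rng.standard_normal(n_samples)  # approximate prior sample
    for s, t in zip(sigmas[:-1], sigmas[1:]):
        x = er_sde_step_ve(x, s, t, denoise, phi, rng)
    return x

if __name__ == "__main__":
    for name, phi in [("phi(x)=x", phi_ode), ("phi(x)=x^2", phi_sde)]:
        xs = sample(gaussian_denoiser, phi)
        print(name, "mean:", xs.mean(), "std:", xs.std())
```

Swapping `phi` is the only change needed to move between deterministic and stochastic behaviour in this sketch, which is the sense in which a single solution space covers both families of samplers. Any `phi` used here should keep `phi(x) / x` non-decreasing so that the injected noise variance stays non-negative.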
