Score-Optimal Diffusion Schedules
Christopher Williams, Andrew Campbell, Arnaud Doucet, Saifuddin Syed
TL;DR
The paper addresses how to choose discretisation schedules for score-based diffusion models, introducing score-optimal diffusion schedules that minimise a data-driven transport cost along the diffusion path. It frames each simulation step as a predictor–corrector decomposition and defines an incremental work measure using Stein/Fisher divergences, then shows that the optimal schedule corresponds to a geodesic in the metric induced by the local cost. An online algorithm estimates the optimal schedule from estimated scores (and Hessians when using $\mathcal{L}_p$), with a velocity scaling $v(t)=\sigma(t)$ to align with the diffusion dynamics. Empirically, the method recovers known performant schedules on image data, improves sampling efficiency, and supports online schedule learning during training; experiments cover 1D and high-dimensional datasets and use pre-trained diffusion models as a testbed. The approach offers a hyperparameter-free, scalable way to tailor the discretisation to the geometry of the data, with practical impact on sample quality and speed for diffusion-based generative models.
Abstract
Denoising diffusion models (DDMs) offer a flexible framework for sampling from high-dimensional data distributions. DDMs generate a path of probability distributions interpolating between a reference Gaussian distribution and a data distribution by incrementally injecting noise into the data. To numerically simulate the sampling process, a discretisation schedule from the reference back towards clean data must be chosen. An appropriate discretisation schedule is crucial to obtain high-quality samples. However, beyond hand-crafted heuristics, a general method for choosing this schedule remains elusive. This paper presents a novel algorithm for adaptively selecting an optimal discretisation schedule with respect to a cost that we derive. Our cost measures the work done by the simulation procedure to transport samples from one point in the diffusion path to the next. Our method does not require hyperparameter tuning and adapts to the dynamics and geometry of the diffusion path. Our algorithm only involves the evaluation of the estimated Stein score, making it scalable to existing pre-trained models at inference time and online during training. We find that our learned schedule recovers performant schedules previously only discovered through manual search and obtains competitive FID scores on image datasets.
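
To make the idea of a cost-adapted schedule concrete, the following is a minimal sketch of one generic way to turn a local cost into a discretisation schedule: place the time points so that every step covers the same length under the metric induced by the cost. It is not the paper's algorithm. The function name `equal_cost_schedule`, the grid `t_grid`, and the toy cost `toy_cost` are all hypothetical, and the cost is assumed to be already available on a grid, whereas the paper estimates its cost from the learned score.

```python
import numpy as np


def equal_cost_schedule(t_grid, local_cost, n_steps):
    """Return n_steps + 1 time points that split the path into steps of
    equal length under the metric sqrt(local_cost).

    t_grid     : increasing 1D array of times where the cost is known
    local_cost : positive 1D array, local cost density at each grid time
                 (hypothetical input; assumed to be given)
    n_steps    : number of discretisation steps wanted
    """
    t_grid = np.asarray(t_grid, dtype=float)
    speed = np.sqrt(np.asarray(local_cost, dtype=float))

    # Cumulative path length via the trapezoid rule.
    increments = 0.5 * (speed[1:] + speed[:-1]) * np.diff(t_grid)
    cum_length = np.concatenate([[0.0], np.cumsum(increments)])

    # Equally spaced targets in length, mapped back to time by
    # inverting the (increasing) cumulative length with interpolation.
    targets = np.linspace(0.0, cum_length[-1], n_steps + 1)
    return np.interp(targets, cum_length, t_grid)


# Toy example: a cost that blows up near t = 0 (an assumption chosen
# purely for illustration), so steps should cluster near t = 0.
t_grid = np.linspace(0.0, 1.0, 1001)
toy_cost = 1.0 / (t_grid + 0.01) ** 2
schedule = equal_cost_schedule(t_grid, toy_cost, n_steps=10)
```

With a constant cost the sketch returns a uniform grid, while the toy cost above concentrates points where the cost is largest. This is the qualitative behaviour one would expect from a schedule that adapts to the geometry of the diffusion path.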
