An analysis of the noise schedule for score-based generative models
Stanislas Strasman, Antonio Ocello, Claire Boyer, Sylvain Le Corff, Vincent Lemaire
TL;DR
The paper studies how time-inhomogeneous noise schedules affect score-based generative models (SGMs). It introduces a unified forward-backward diffusion framework with a parametric noise schedule and derives a non-asymptotic bound on the KL divergence that depends explicitly on the schedule, plus a refined Wasserstein bound under Lipschitz and strong log-concavity assumptions. The authors show that accounting for the contraction of the backward process reduces the mixing-time error, and they provide numerical experiments on Gaussian targets and CIFAR-10 to guide schedule design, including a parametric schedule that often outperforms the standard linear and cosine schedules. They also extend the analysis to non-Gaussian targets via Wasserstein-based metrics and describe practical data preprocessing strategies that tighten the bounds and improve generation quality. Overall, the work offers theoretical and empirical guidance for selecting and tuning noise schedules in both simple and complex data settings, with publicly available code for reproducibility.
Abstract
Score-based generative models (SGMs) aim to estimate a target data distribution by learning score functions using only noise-perturbed samples from the target. Recent literature has focused extensively on assessing the error between the target and estimated distributions, gauging the generative quality through the Kullback-Leibler (KL) divergence and Wasserstein distances. Under mild assumptions on the data distribution, we establish an upper bound for the KL divergence between the target and the estimated distributions, explicitly depending on any time-dependent noise schedule. Under additional regularity assumptions, taking advantage of favorable underlying contraction mechanisms, we provide a tighter error bound in Wasserstein distance compared to state-of-the-art results. In addition to being tractable, this upper bound jointly incorporates properties of the target distribution and SGM hyperparameters that need to be tuned during training. Finally, we illustrate these bounds through numerical experiments using simulated and CIFAR-10 datasets, identifying an optimal range of noise schedules within a parametric family.
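To make the notion of a time-dependent noise schedule concrete, the following is a minimal sketch of how a schedule enters the forward noising step. It assumes the standard variance-preserving forward SDE dX_t = -(beta(t)/2) X_t dt + sqrt(beta(t)) dB_t with a linear schedule; the paper's exact parametrization and its parametric family are not reproduced here. The function names and the default values of beta_min, beta_max and T are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical linear schedule beta(t) on [0, T]; the default values are
# illustrative assumptions, not the settings used in the paper.
def linear_beta(t, beta_min=0.1, beta_max=20.0, T=1.0):
    return beta_min + (beta_max - beta_min) * t / T

# Closed-form integral of the linear schedule from 0 to t.
def integrated_linear_beta(t, beta_min=0.1, beta_max=20.0, T=1.0):
    return beta_min * t + 0.5 * (beta_max - beta_min) * t ** 2 / T

# Sample X_t given X_0 under the assumed variance-preserving forward SDE:
# X_t | X_0 ~ N(exp(-B/2) X_0, (1 - exp(-B)) I), with B the integrated schedule.
def perturb(x0, t, rng):
    big_b = integrated_linear_beta(t)
    mean_scale = np.exp(-0.5 * big_b)
    std = np.sqrt(1.0 - np.exp(-big_b))
    return mean_scale * x0 + std * rng.standard_normal(x0.shape)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 2))       # toy "data" batch
x_mid = perturb(x0, 0.5, rng)          # partially noised samples
x_end = perturb(x0, 1.0, rng)          # close to a standard Gaussian
```

Under this assumption, the schedule only enters the marginal through its integral, so two schedules with the same integral over [0, T] reach the same terminal noise level while distributing the noise differently along the path.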
