Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions
Kaihong Zhang, Caitlyn H. Yin, Feng Liang, Jingbo Liu
TL;DR
This work analyzes score-based diffusion model sampling in a nonparametric, large-sample regime and removes the need for density lower bounds by assuming only sub-Gaussian data. It introduces a truncated kernel density estimator to construct a time-dependent score estimator whose error bound tightens as the diffusion time $t$ increases, and couples it with an early stopping time $t_0$ to control low-density regions. Using Girsanov's theorem, the authors translate score-estimation errors into total-variation (TV) guarantees for the generated data: for a sub-Gaussian ground truth, the diffusion sampler achieves a TV rate of $\mathrm{polylog}(n)\, n^{-1/2} t^{-d/4}$; when the true density additionally lies in a Sobolev class with smoothness $\beta \le 2$ and $t_0$ is chosen appropriately, the rate becomes the classical minimax rate $n^{-\beta/(2\beta+d)}$ up to polylog factors. This establishes nearly minimax optimal sampling without the restrictive density-lower-bound assumptions of prior work, broadening the applicability of score-based diffusion models in nonparametric settings.
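To see how the two rates quoted above fit together, the following heuristic calculation may help; it is a sketch under stated assumptions, not the authors' proof. It writes $p_t = p_0 * \mathcal{N}(0, t\boldsymbol{I}_d)$, $s_t$ and $\hat{s}_t$ for the true and estimated scores, and $\hat{p}_{t_0}$ for the law of the generated sample. It assumes $t_0 < 1 < T$ with a horizon $T$ polynomial in $n$ and negligible initialization error, uses the score mean square error bound stated in the abstract, drops polylog factors, and assumes the smoothing bias $\mathrm{TV}(p_0, p_{t_0}) \lesssim t_0^{\beta/2}$ for $\beta \le 2$. The horizon $T$, the notation, and the bias bound are assumptions introduced here for illustration.

```latex
\begin{align*}
\mathrm{KL}\left(p_{t_0} \,\|\, \hat{p}_{t_0}\right)
  &\lesssim \int_{t_0}^{T} \mathbb{E}\,\bigl\|\hat{s}_t - s_t\bigr\|^2 \, dt
  && \text{(Girsanov)} \\
  &\lesssim \frac{1}{n}\int_{t_0}^{1} t^{-\frac{d+2}{2}} \, dt
     + \frac{1}{n}\int_{1}^{T} t^{-1} \, dt
   = \frac{2}{dn}\left(t_0^{-d/2} - 1\right) + \frac{\log T}{n}, \\
\mathrm{TV}\left(p_{t_0}, \hat{p}_{t_0}\right)
  &\le \sqrt{\tfrac{1}{2}\,\mathrm{KL}\left(p_{t_0} \,\|\, \hat{p}_{t_0}\right)}
   = \widetilde{O}\left(n^{-1/2}\, t_0^{-d/4}\right)
  && \text{(Pinsker)} \\
t_0^{\beta/2} \asymp n^{-1/2}\, t_0^{-d/4}
  &\iff t_0 \asymp n^{-\frac{2}{2\beta+d}}
  \;\Longrightarrow\;
  \mathrm{TV}\left(p_0, \hat{p}_{t_0}\right)
   = \widetilde{O}\left(n^{-\frac{\beta}{2\beta+d}}\right).
\end{align*}
```

The last line balances the assumed early-stopping bias $t_0^{\beta/2}$ against the sampling error $n^{-1/2} t_0^{-d/4}$. Stopping earlier lowers the bias but inflates the score-estimation term, and the balance point recovers the classical minimax exponent.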
Abstract
We study the asymptotic error of score-based diffusion model sampling in large-sample scenarios from a nonparametric statistics perspective. We show that a kernel-based score estimator achieves an optimal mean square error of $\widetilde{O}\left(n^{-1} t^{-\frac{d+2}{2}}(t^{\frac{d}{2}} \vee 1)\right)$ for the score function of $p_0*\mathcal{N}(0,t\boldsymbol{I}_d)$, where $n$ and $d$ denote the sample size and the dimension, respectively, $t$ is bounded above and below by polynomials of $n$, and $p_0$ is an arbitrary sub-Gaussian distribution. As a consequence, this yields an $\widetilde{O}\left(n^{-1/2} t^{-\frac{d}{4}}\right)$ upper bound for the total variation error of the distribution of the sample generated by the diffusion model under a mere sub-Gaussian assumption. If, in addition, $p_0$ belongs to the nonparametric family of the $\beta$-Sobolev space with $\beta\le 2$, then by adopting an early stopping strategy we obtain that the diffusion model is nearly (up to log factors) minimax optimal. This removes the crucial lower bound assumption on $p_0$ in previous proofs of the minimax optimality of the diffusion model for nonparametric families.
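As an illustration of what a kernel-based score estimator for $p_0*\mathcal{N}(0,t\boldsymbol{I}_d)$ can look like, here is a minimal NumPy sketch. It smooths the empirical distribution with a Gaussian kernel of variance $t$, takes the gradient of the log-density, and sets the score to zero wherever the estimated density falls below a threshold. The function name `kde_score` and the threshold parameter `rho` are hypothetical, and the truncation rule is an assumed simplification rather than the paper's exact construction.

```python
import numpy as np


def kde_score(x, samples, t, rho=0.0):
    """Truncated Gaussian-KDE score estimate (illustrative sketch).

    x:       (m, d) query points
    samples: (n, d) i.i.d. draws from p_0
    t:       diffusion time, i.e. variance of the Gaussian smoothing
    rho:     hypothetical density threshold; the score is set to zero
             where the estimated density is <= rho
    Returns (score, density) with shapes (m, d) and (m,).
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape

    diff = samples[None, :, :] - x[:, None, :]        # (m, n, d): X_i - x
    sq_dist = np.sum(diff ** 2, axis=-1)              # (m, n)
    log_kernel = -sq_dist / (2.0 * t) - 0.5 * d * np.log(2.0 * np.pi * t)

    # Log-sum-exp for numerical stability when t is small.
    max_log = log_kernel.max(axis=1, keepdims=True)   # (m, 1)
    unnorm = np.exp(log_kernel - max_log)             # (m, n)
    total = unnorm.sum(axis=1, keepdims=True)         # (m, 1)
    density = np.exp((max_log + np.log(total)).squeeze(1) - np.log(n))

    # grad log p_hat(x) = sum_i w_i (X_i - x) / t, with softmax weights w_i.
    weights = unnorm / total
    score = np.einsum("mn,mnd->md", weights, diff) / t

    # Truncation: zero out the score in estimated low-density regions.
    keep = density > rho
    return np.where(keep[:, None], score, 0.0), density
```

With a single sample at the origin and `t=1.0`, the estimate at a point `x` reduces to `-x`, the score of a standard Gaussian, which gives a quick sanity check. Choosing `rho` as a function of `n` and `t` is left open here.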
