On the Interpolation Effect of Score Smoothing in Diffusion Models
Zhengdao Chen
TL;DR
The paper examines how score smoothing, induced by regularization of neural-network (NN) score estimators, biases the denoising dynamics of diffusion models toward interpolating along the subspace of the training data rather than memorizing individual training points. In 1D, it develops a theoretical model showing that regularized two-layer ReLU networks learn a Smoothed PL-ESF, a smoothed version of the empirical score function (ESF); it gives a variational argument that near-minimizers have $\delta_t \propto \sqrt{t}$, and it derives analytic flow dynamics for the resulting denoising process. In higher dimensions, the analysis yields a tangent-normal decomposition: smoothing preserves interpolation along the subspace while the normal components shrink, which enables subspace recovery without memorization, in contrast with naive early stopping. Numerical experiments corroborate the interpolation effect: the NN-learned score function (SF) closely matches the Smoothed PL-ESF and yields interpolating samples on linear and circular manifolds, even when the regularization is only implicit. Overall, the work offers a mechanistic view of how score smoothing under NN training can endow diffusion models with generalization and creativity beyond the training data, which may guide the future design of score estimators and regularization strategies.
Abstract
Score-based diffusion models have achieved remarkable progress in various domains with the ability to generate new data samples that do not exist in the training set. In this work, we study the hypothesis that such creativity arises from an interpolation effect caused by a smoothing of the empirical score function. Focusing on settings where the training set lies uniformly in a one-dimensional subspace, we show theoretically how regularized two-layer ReLU neural networks tend to learn approximately a smoothed version of the empirical score function, and further probe the interplay between score smoothing and the denoising dynamics with analytical solutions and numerical experiments. In particular, we demonstrate how a smoothed score function can lead to the generation of samples that interpolate the training data along their subspace while avoiding full memorization. Moreover, we present experimental evidence that learning score functions with neural networks indeed induces a score smoothing effect, including in simple nonlinear settings and without explicit regularization.
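To make the notion of a smoothed empirical score function concrete, the following is a minimal NumPy sketch on a toy 1D training set. It is not the paper's construction. It assumes that noising at time $t$ adds Gaussian noise of variance $t$, so that the empirical score is the score of a Gaussian mixture centered at the training points. The box-average smoothing, the function names `empirical_score` and `box_smoothed_score`, and the window half-width `delta` are hypothetical choices made only for illustration.

```python
import numpy as np

def empirical_score(x, train, t):
    """Score of the mixture (1/n) * sum_i N(y_i, t) at the 1D points x.

    Assumption: noise variance equals t (a convention chosen for this sketch).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    train = np.asarray(train, dtype=float)
    diff = train[None, :] - x[:, None]            # (m, n) array of y_i - x
    logw = -diff**2 / (2.0 * t)                   # log posterior weights, unnormalized
    logw -= logw.max(axis=1, keepdims=True)       # stabilize the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    return (w * diff).sum(axis=1) / t             # (E[y | x] - x) / t

def box_smoothed_score(x, train, t, delta, n_quad=201):
    """Average of the empirical score over the window [x - delta, x + delta].

    Illustrative smoothing only; `delta` is a hypothetical window half-width.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    offsets = np.linspace(-delta, delta, n_quad)
    grid = (x[:, None] + offsets[None, :]).ravel()
    vals = empirical_score(grid, train, t)
    return vals.reshape(len(x), n_quad).mean(axis=1)

# Two training points and a small noise level.
train = [-1.0, 1.0]
t = 0.01
xs = np.linspace(-1.5, 1.5, 7)
print(empirical_score(xs, train, t))
print(box_smoothed_score(xs, train, t, delta=0.5))
```

In this toy setting the empirical score changes very sharply around the midpoint between the two training points, pushing samples toward one point or the other, whereas the box-averaged version is much flatter there. This is only meant to convey the qualitative idea that a smoothed score need not force every sample onto a training point.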
