Efficient Denoising using Score Embedding in Score-based Diffusion Models
Andrew S. Na, William Gao, Justin W. L. Wan
TL;DR
This work tackles the heavy training cost of denoising score-based diffusion models by pre-computing the score field, before training, from a numerical solution of the Fokker-Planck (FP) equation for the log-density $m = \log p$. The computed score is embedded into the image via the transport equation, providing a label-embedded input that guides learning under a sliced Wasserstein objective, thereby reducing both the number of epochs and the amount of training data required. A semi-explicit finite-difference scheme handles the nonlinearity of the log-density FP equation, with sparse Gaussian elimination accelerating the solve. The approach is validated on CIFAR10, CelebA, and ImageNet, showing 3–5x training-time speedups while preserving image quality. The method offers a practical path to more energy-efficient and scalable diffusion-based denoising and generation, with future work extending the framework to videos and higher-dimensional densities.
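To make the nonlinearity of the log-density FP equation concrete, here is a sketch of its standard form, assuming a forward SDE $dx = f(x,t)\,dt + g(t)\,dW_t$ with a state-independent diffusion coefficient. The symbols $f$ and $g$ are assumptions introduced for illustration, and the paper's exact formulation may differ. Substituting $p = e^{m}$ into the FP equation and dividing by $p$ gives:

$$
\frac{\partial p}{\partial t} = -\nabla \cdot (f\,p) + \frac{g^2}{2}\,\Delta p
\quad\Longrightarrow\quad
\frac{\partial m}{\partial t} = -\nabla \cdot f - f \cdot \nabla m + \frac{g^2}{2}\left(\Delta m + \|\nabla m\|^2\right).
$$

The quadratic term $\|\nabla m\|^2$ is what makes the equation nonlinear in $m$, and the score is recovered as $\nabla m = \nabla \log p$.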
Abstract
It is well known that training a denoising score-based diffusion model requires tens of thousands of epochs and a substantial amount of image data. In this paper, we propose a method that makes the training of score-based diffusion models more efficient by decreasing the number of epochs needed. We accomplish this by solving the log-density Fokker-Planck (FP) equation numerically to compute the score \textit{before} training. The pre-computed score is embedded into the image to encourage faster training under the sliced Wasserstein distance. Consequently, the method also decreases the number of images needed to train the neural network to learn an accurate score. Our numerical experiments demonstrate the improved performance of the proposed method compared to standard score-based diffusion models: it achieves image quality similar to the standard method in meaningfully less training time.
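For readers unfamiliar with the sliced Wasserstein distance mentioned above, the following is a minimal Monte Carlo sketch of the sliced 2-Wasserstein distance between two equal-sized sample sets. It is not the authors' loss implementation: the function name `sliced_wasserstein` and the parameters `n_projections` and `seed` are hypothetical, chosen only for illustration.

```python
import numpy as np


def sliced_wasserstein(x, y, n_projections=128, seed=0):
    """Estimate the sliced 2-Wasserstein distance between samples x and y.

    x, y: arrays of shape (n, d) with the same n (an assumption of this
    sketch; it keeps the 1D optimal transport step a simple sort).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 2 or x.shape != y.shape:
        raise ValueError("x and y must both have shape (n, d)")

    rng = np.random.default_rng(seed)
    d = x.shape[1]

    # Draw random directions and normalize them to the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # Project both sample sets onto every direction: shape (n, n_projections).
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)

    # In 1D, optimal transport matches sorted samples; average the squared
    # differences over samples and directions, then take the square root.
    return float(np.sqrt(np.mean((px - py) ** 2)))
```

In one dimension every unit direction is $\pm 1$, so shifting a sample set by a constant $c$ yields a distance of $|c|$, which gives a quick sanity check for the sketch.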
