Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data
Minshuo Chen, Kaixuan Huang, Tuo Zhao, Mengdi Wang
TL;DR
The paper develops a theory of diffusion models for data that lie on an unknown low-dimensional linear subspace. It introduces an encoder-decoder score network that approximates the score function in the L2 sense, and proves sample-efficient score estimation with rates that depend on the intrinsic dimension rather than the ambient dimension. By analyzing the backward diffusion in the latent subspace and applying Girsanov’s theorem, the authors establish distribution-estimation guarantees: the subspace is recovered, the generated distribution converges to the latent data distribution in a controlled way, and the component orthogonal to the subspace vanishes. The results show that diffusion models can circumvent the curse of ambient dimensionality and capture intrinsic geometric structure through end-to-end learning. The framework lays groundwork for extending diffusion theory to broader manifold settings.
Abstract
Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations lag far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models when data are supported on an unknown low-dimensional linear subspace. Our results provide sample complexity bounds for distribution estimation using diffusion models. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. Furthermore, the distribution generated from the estimated score function captures the geometric structure of the data and converges to a close vicinity of the data distribution. The convergence rate depends on the subspace dimension, indicating that diffusion models can circumvent the curse of ambient dimensionality.
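To make the subspace structure concrete, the following is a minimal numerical sketch, not the paper's construction. It assumes a Gaussian latent distribution (so the score has a closed form) and a standard Ornstein-Uhlenbeck forward process with signal scale e^{-t/2} and noise variance h(t) = 1 - e^{-t}; the names `A`, `Sigma`, `score_full`, and `score_decomposed` are illustrative. Under these assumptions the score of the noised data splits into a part that lives in the subspace and depends only on the projected point, and an orthogonal part that simply pulls the point back toward the subspace.

```python
import numpy as np

def score_full(x, A, Sigma, t):
    """Score of X_t computed directly in the ambient space R^D.

    Assumes x0 = A z with z ~ N(0, Sigma), so X_t ~ N(0, a2 * A Sigma A^T + h * I).
    """
    a2 = np.exp(-t)          # squared signal scale (assumed OU forward process)
    h = 1.0 - a2             # noise variance at time t
    D = A.shape[0]
    cov = a2 * A @ Sigma @ A.T + h * np.eye(D)
    return -np.linalg.solve(cov, x)

def score_decomposed(x, A, Sigma, t):
    """Same score, split into on-subspace and orthogonal components."""
    a2 = np.exp(-t)
    h = 1.0 - a2
    d = A.shape[1]
    z = A.T @ x                                   # encode: project onto the subspace
    latent_score = -np.linalg.solve(a2 * Sigma + h * np.eye(d), z)
    on_part = A @ latent_score                    # decode: lift back to R^D
    orth_part = -(x - A @ z) / h                  # pulls x toward the subspace
    return on_part, orth_part

rng = np.random.default_rng(0)
D, d, t = 8, 2, 0.5                               # ambient dim, intrinsic dim, time
A, _ = np.linalg.qr(rng.standard_normal((D, d)))  # orthonormal basis of the subspace
B = rng.standard_normal((d, d))
Sigma = B @ B.T + np.eye(d)                       # positive-definite latent covariance
x = rng.standard_normal(D)

on_part, orth_part = score_decomposed(x, A, Sigma, t)
gap = np.max(np.abs(score_full(x, A, Sigma, t) - (on_part + orth_part)))
print("max abs difference between the two computations:", gap)
```

In this sketch only the d-dimensional latent score would need to be learned; the orthogonal part is a fixed linear map scaled by 1/h(t), which grows as t approaches 0 and drives the off-subspace component of generated samples toward zero.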
