Improved Sample Complexity For Diffusion Model Training Without Empirical Risk Minimizer Access

Mudit Gaur; Prashant Trivedi; Sasidhar Kunapuli; Amrit Singh Bedi; Vaneet Aggarwal

TL;DR

The paper studies the sample complexity of training score-based diffusion models without access to exact empirical risk minimizers (ERMs). The authors formalize the forward–backward SDE framework, discretize it via a DDPM-style sequence, and decompose the score estimation error into approximation, statistical, and optimization components under a Polyak–Łojasiewicz-type condition. From this they derive a finite-sample bound of $\tilde{\mathcal{O}}(\epsilon^{-4})$ for reaching a target accuracy $\epsilon$ in total variation. The bound holds without ERM access and avoids exponential dependence on neural network parameters, a theoretical advance over prior results that required ERM access. The results indicate that diffusion models can reach arbitrarily small distributional discrepancy with a sample size polynomial in $1/\epsilon$, and the paper outlines directions for extending the guarantees to conditional generation and related setups.
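To make the three error sources concrete, here is a minimal toy sketch of score estimation trained by plain minibatch SGD rather than by an exact ERM solver. It is not the paper's network, loss weighting, or algorithm. Everything in it is an illustrative assumption: 1-D Gaussian data, an Ornstein–Uhlenbeck forward process $x_t = e^{-t}x_0 + \sqrt{1-e^{-2t}}\,z$, a two-parameter linear score model, and the hypothetical names `true_score` and `fit_linear_score`. For Gaussian data the true score is linear in $x$, so the approximation error is zero here and the remaining gap comes from the finite sample (statistical) and the finite number of SGD steps (optimization).

```python
import numpy as np

def true_score(x, t, mu, sigma):
    """Exact score of the OU-noised marginal when the data are N(mu, sigma^2)."""
    var_t = sigma**2 * np.exp(-2 * t) + 1.0 - np.exp(-2 * t)
    return -(x - mu * np.exp(-t)) / var_t

def fit_linear_score(n=20000, t=0.5, mu=2.0, sigma=0.5,
                     steps=3000, batch=256, lr=0.05, seed=0):
    """Fit s(x) = a*x + b by denoising score matching with minibatch SGD."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(mu, sigma, size=n)        # finite training sample
    decay = np.exp(-t)                        # OU mean contraction e^{-t}
    noise_std = np.sqrt(1.0 - np.exp(-2 * t)) # std of the injected noise
    a, b = 0.0, 0.0
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch)  # sample a minibatch
        z = rng.standard_normal(batch)
        xt = decay * x0[idx] + noise_std * z  # noised samples at time t
        target = -z / noise_std               # conditional score of x_t given x_0
        resid = a * xt + b - target
        # Gradient step on the mean squared residual.
        a -= lr * 2.0 * np.mean(resid * xt)
        b -= lr * 2.0 * np.mean(resid)
    return a, b

a_hat, b_hat = fit_linear_score()
grid = np.linspace(-1.0, 3.0, 9)
max_gap = np.max(np.abs(a_hat * grid + b_hat - true_score(grid, 0.5, 2.0, 0.5)))
print(f"a={a_hat:.3f}, b={b_hat:.3f}, max score gap on grid={max_gap:.3f}")
```

Shrinking `n` or `steps` in this sketch increases the statistical or optimization part of the gap respectively, which is the separation the decomposition relies on.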

Abstract

Diffusion models have demonstrated state-of-the-art performance across vision, language, and scientific domains. Despite their empirical success, prior theoretical analyses of the sample complexity suffer from poor scaling with input data dimension or rely on unrealistic assumptions such as access to exact empirical risk minimizers. In this work, we provide a principled analysis of score estimation, establishing a sample complexity bound of $\mathcal{O}(\epsilon^{-4})$. Our approach leverages a structured decomposition of the score estimation error into statistical, approximation, and optimization errors, enabling us to eliminate the exponential dependence on neural network parameters that arises in prior analyses. It is the first such result that achieves sample complexity bounds without assuming access to the empirical risk minimizer of score function estimation loss.
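The three-way decomposition can be written schematically as a triangle inequality. The notation below is an illustrative assumption, not the paper's: $s$ is the true score, $\theta^\star$ the best parameter in the network class under the population loss, $\theta_n$ the empirical risk minimizer over $n$ samples, and $\hat\theta$ the parameter actually returned by training. The paper's exact norms, time weighting, and constants may differ.

$$
\| s_{\hat\theta} - s \|
\;\le\;
\underbrace{\| s_{\hat\theta} - s_{\theta_n} \|}_{\text{optimization error}}
\;+\;
\underbrace{\| s_{\theta_n} - s_{\theta^\star} \|}_{\text{statistical error}}
\;+\;
\underbrace{\| s_{\theta^\star} - s \|}_{\text{approximation error}}
$$

Analyses that assume ERM access set $\hat\theta = \theta_n$, which makes the first term vanish. Keeping it explicit is what allows a bound for training that never reaches the empirical minimizer.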
