DDPM Score Matching and Distribution Learning

Sinho Chewi; Alkis Kalavasis; Anay Mehrotra; Omar Montasser

TL;DR

The paper introduces a framework that offers the first principled method to prove computational lower bounds for score estimation across general distributions. As an application, it establishes cryptographic lower bounds for score estimation in general Gaussian mixture models, conceptually recovering Song's (NeurIPS'24) result and advancing his key open problem.

Abstract

Score estimation is the backbone of score-based generative models (SGMs), especially denoising diffusion probabilistic models (DDPMs). A key result in this area shows that with accurate score estimates, SGMs can efficiently generate samples from any realistic data distribution (Chen et al., ICLR'23; Lee et al., ALT'23). This distribution learning result, where the learned distribution is implicitly that of the sampler's output, does not explain how score estimation relates to the classical tasks of parameter and density estimation. This paper introduces a framework that reduces score estimation to these two tasks, with various implications for statistical and computational learning theory:

Parameter Estimation: Koehler et al. (ICLR'23) demonstrate that a score-matching variant is statistically inefficient for the parametric estimation of multimodal densities common in practice. In contrast, we show that under mild conditions, denoising score-matching in DDPMs is asymptotically efficient.

Density Estimation: By linking generation to score estimation, we lift existing score estimation guarantees to $(ε,δ)$-PAC density estimation, i.e., outputting a function that approximates the target log-density within $ε$ on all but a $δ$-fraction of the space. We provide (i) minimax rates for density estimation over Hölder classes and (ii) a quasi-polynomial PAC density estimation algorithm for the classical Gaussian location mixture model, building on and addressing an open problem from Gatmiry et al. (arXiv'24).

Lower Bounds for Score Estimation: Our framework offers the first principled method to prove computational lower bounds for score estimation across general distributions. As an application, we establish cryptographic lower bounds for score estimation in general Gaussian mixture models, conceptually recovering Song's (NeurIPS'24) result and advancing his key open problem.
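To make the link between denoising score matching and parameter estimation concrete, here is a minimal sketch on a toy problem. It is not the paper's setup. It assumes one-dimensional data from $N(θ, 1)$ with unknown mean $θ$, a single noise level `sigma`, and additive Gaussian noise rather than a full DDPM forward process. The function names (`add_noise`, `dsm_loss`, `dsm_minimizer`) are hypothetical. The noised data follow $N(θ, 1 + σ^2)$, so the model score is $-(x - θ)/(1 + σ^2)$, and the denoising target is $-z/σ$.

```python
import numpy as np


def add_noise(x0, sigma, rng):
    """Return noised samples x_t = x0 + sigma * z together with the noise z."""
    z = rng.standard_normal(x0.shape)
    return x0 + sigma * z, z


def dsm_loss(theta, xt, z, sigma):
    """Empirical denoising score matching loss for the model N(theta, 1 + sigma^2).

    The regression target -z / sigma is the score of the Gaussian
    conditional density of x_t given x0.
    """
    v = 1.0 + sigma**2
    model_score = -(xt - theta) / v
    return np.mean((model_score + z / sigma) ** 2)


def dsm_minimizer(xt, z, sigma):
    """Closed-form minimizer of dsm_loss over theta (the loss is quadratic)."""
    v = 1.0 + sigma**2
    return np.mean(xt - v * z / sigma)


# Toy experiment (assumed values, for illustration only).
rng = np.random.default_rng(0)
true_mean = 2.0
sigma = 1.0
x0 = rng.normal(true_mean, 1.0, size=200_000)
xt, z = add_noise(x0, sigma, rng)
theta_hat = dsm_minimizer(xt, z, sigma)
print(f"DSM estimate of the mean: {theta_hat:.3f}")
```

In this toy model the minimizer simplifies to the mean of `x0 - z / sigma`, so it is unbiased but picks up extra variance from the injected noise at a single noise level. The efficiency claims in the abstract concern the DDPM objective under the paper's conditions, which this sketch does not reproduce.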

Figures (2)

Theorems & Definitions (85)