Statistical guarantees for denoising reflected diffusion models
Asbjørn Holk, Claudia Strauch, Lukas Trottner
TL;DR
This work analyzes denoising reflected diffusion models (DRDMs) on bounded domains and establishes minimax-optimal convergence rates in total variation, up to polylogarithmic factors, under Sobolev smoothness assumptions. It combines a space-time spectral decomposition of the reflected diffusion generator with a neural-network-based score estimator to control the score-matching error, yielding an explicit rate of $n^{-\frac{s}{2s+d}}$ (up to logarithmic factors) for the generated density, where $n$ is the sample size, $s$ the Sobolev smoothness and $d$ the dimension. The authors propose a three-step approximation framework (spectral truncation, neural network approximation of the truncated score, and time interpolation) to construct a tractable score estimator with provable guarantees. The results provide statistical guarantees for diffusion-based generative models under reflection constraints and a foundation for extending such guarantees to structured data and bounded-state-space settings.
Abstract
In recent years, denoising diffusion models have become a crucial area of research due to their abundance in the rapidly expanding field of generative AI. While recent statistical advances have delivered explanations for the generation ability of idealised denoising diffusion models for high-dimensional target data, implementations introduce thresholding procedures for the generating process to overcome issues arising from the unbounded state space of such models. This mismatch between theoretical design and implementation of diffusion models has been addressed empirically by using a reflected diffusion process as the driver of noise instead. In this paper, we study statistical guarantees of these denoising reflected diffusion models. In particular, we establish minimax optimal rates of convergence in total variation, up to a polylogarithmic factor, under Sobolev smoothness assumptions. Our main contributions include the statistical analysis of this novel class of denoising reflected diffusion models and a refined score approximation method in both time and space, leveraging spectral decomposition and rigorous neural network analysis.
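To make the reflected noising process and the idea of spectral truncation concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the paper: the domain is the unit interval $[0,1]$ and the forward process is reflected Brownian motion with generator $\frac{1}{2}\,\mathrm{d}^2/\mathrm{d}x^2$ and Neumann boundary conditions. All function names are hypothetical. The sketch compares the transition density obtained from a truncated eigenfunction expansion with the same density computed by the method of images, and draws noised samples by folding ordinary Brownian increments back into the interval.

```python
import numpy as np


def fold_to_unit_interval(x):
    """Reflect real values into [0, 1] by folding at the boundaries 0 and 1."""
    y = np.mod(np.asarray(x, dtype=float), 2.0)  # the fold map has period 2
    return np.where(y > 1.0, 2.0 - y, y)


def neumann_kernel_spectral(x, y, t, n_terms=50):
    """Truncated spectral expansion of the transition density of reflected
    Brownian motion on [0, 1] (illustrative assumption, not the paper's setup).

    Eigenpairs of (1/2) d^2/dx^2 with Neumann boundary conditions:
    e_0 = 1 with eigenvalue 0, and e_k(x) = sqrt(2) cos(k pi x)
    with eigenvalue -(k pi)^2 / 2 for k >= 1.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    total = np.ones(np.broadcast(x, y).shape)
    for k in range(1, n_terms + 1):
        decay = np.exp(-0.5 * (k * np.pi) ** 2 * t)
        total = total + 2.0 * decay * np.cos(k * np.pi * x) * np.cos(k * np.pi * y)
    return total


def neumann_kernel_images(x, y, t, n_images=10):
    """The same density via the method of images (sum of mirrored Gaussians)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def gauss(z):
        return np.exp(-z ** 2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

    total = np.zeros(np.broadcast(x, y).shape)
    for m in range(-n_images, n_images + 1):
        total = total + gauss(y - x + 2.0 * m) + gauss(y + x + 2.0 * m)
    return total


# Compare both representations for a fixed start point and noising time.
grid = np.linspace(0.0, 1.0, 201)
x0, t = 0.2, 0.05
spectral = neumann_kernel_spectral(x0, grid, t, n_terms=20)
images = neumann_kernel_images(x0, grid, t)
max_gap = float(np.max(np.abs(spectral - images)))

# Forward noising: fold an unconstrained Gaussian step back into [0, 1].
rng = np.random.default_rng(0)
samples = fold_to_unit_interval(x0 + np.sqrt(t) * rng.standard_normal(10_000))
```

Because the eigenvalues grow quadratically in $k$, the exponential factors decay quickly for any fixed $t > 0$, so a short truncation already reproduces the kernel closely in this toy setting. As $t$ grows, only the constant eigenfunction survives and the noised distribution approaches the uniform law on $[0,1]$.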
