On Error Propagation of Diffusion Models
Yangming Li, Mihaela van der Schaar
TL;DR
This work identifies and analyzes error propagation in diffusion models, showing that their chain structure can amplify intermediate errors. It introduces a formal framework with modular error, cumulative error, and a propagation equation governed by an amplification factor $\mu_t$, and proves that standard DMs exhibit propagation tendencies ($\mu_t \ge 1$) under reasonable assumptions. To mitigate propagation, the authors propose a tractable regularization term derived from an upper bound on the cumulative error, along with a bootstrap-based estimation algorithm inspired by TD learning, enabling efficient training. Empirical results on CIFAR-10, CelebA, and ImageNet demonstrate substantial reductions in error propagation, improved FID scores, and superiority over baselines, validating both the theory and the practical utility of the proposed regularization. The work provides a principled approach to diagnose and reduce exposure-bias-like effects in diffusion models, with direct implications for more reliable and higher-quality image synthesis.
Abstract
Although diffusion models (DMs) have shown promising performance in a number of tasks (e.g., speech synthesis and image generation), they might suffer from error propagation because of their sequential structure. However, this is not certain, because some sequential models, such as conditional random fields (CRFs), are free from this problem. To address this issue, we develop a theoretical framework to mathematically formulate error propagation in the architecture of DMs. The framework contains three elements: modular error, cumulative error, and a propagation equation. The equation relates the modular and cumulative errors and shows that DMs are indeed affected by error propagation. Our theoretical study also suggests that the cumulative error is closely related to the generation quality of DMs. Based on this finding, we apply the cumulative error as a regularization term to reduce error propagation. Because the term is computationally intractable, we derive its upper bound and design a bootstrap algorithm to efficiently estimate the bound for optimization. We have conducted extensive experiments on multiple image datasets, showing that our proposed regularization reduces error propagation, significantly improves vanilla DMs, and outperforms previous baselines.
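To make the role of an amplification factor concrete, the following toy sketch iterates a simple linear recursion in which each denoising step scales the error inherited from the previous step and adds its own modular error. The recursion form, the function name `cumulative_errors`, and the numeric values are illustrative assumptions, not the paper's exact propagation equation or measurements; the sketch only shows why a factor of at least 1 prevents earlier errors from fading along the chain.

```python
from typing import List, Sequence


def cumulative_errors(modular: Sequence[float], mu: Sequence[float]) -> List[float]:
    """Iterate a toy error recursion along the denoising chain.

    Steps are indexed in denoising order: k = 0 is the first denoising step.
    Assumed (illustrative) recursion:
        cumu[0] = modular[0]
        cumu[k] = mu[k] * cumu[k - 1] + modular[k]
    mu[0] is unused because no error precedes the first step.
    """
    if len(modular) != len(mu):
        raise ValueError("modular and mu must have the same length")
    cumu: List[float] = []
    for k, (e_mod, factor) in enumerate(zip(modular, mu)):
        inherited = factor * cumu[k - 1] if k > 0 else 0.0
        cumu.append(inherited + e_mod)
    return cumu


# Hypothetical per-step modular errors (same magnitude at every step).
modular = [1.0, 1.0, 1.0, 1.0]

no_amplification = cumulative_errors(modular, [1.0] * 4)  # [1.0, 2.0, 3.0, 4.0]
amplified = cumulative_errors(modular, [2.0] * 4)         # [1.0, 3.0, 7.0, 15.0]
damped = cumulative_errors(modular, [0.5] * 4)            # [1.0, 1.5, 1.75, 1.875]
```

In this toy setting, a factor of exactly 1 already makes the final error the sum of all modular errors, a larger factor makes it grow geometrically, and a factor below 1 keeps it bounded.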
