Optimizing for the Shortest Path in Denoising Diffusion Model

Ping Chen, Xingpeng Zhang, Zhaoxiang Liu, Huan Hu, Xiang Liu, Kai Wang, Min Wang, Yanlin Qian, Shiguo Lian

TL;DR

This work addresses the computational bottleneck of diffusion-based generative models by reframing denoising as a shortest-path problem over a reverse-step graph. ShortDF optimizes initial residuals and propagates them through a residual-path relaxation, using a multi-state training setup to stabilize learning and edge-weight estimation. The method achieves substantial inference speedups (reducing the number of sampling steps from around $1000$ to below $20$) while preserving or improving sample fidelity, as demonstrated by FID scores and speed metrics on CIFAR-10, CelebA, and LSUN Churches. This graph-theoretic approach enables efficient, high-quality diffusion suitable for interactive or real-time applications, and provides a foundation for end-to-end optimization of diffusion samplers and generators.

Abstract

In this research, we propose a novel denoising diffusion model based on shortest-path modeling that optimizes residual propagation to enhance both denoising efficiency and quality. Drawing on Denoising Diffusion Implicit Models (DDIM) and insights from graph theory, our model, termed the Shortest Path Diffusion Model (ShortDF), treats the denoising process as a shortest-path problem aimed at minimizing reconstruction error. By optimizing the initial residuals, we improve the efficiency of the reverse diffusion process and the quality of the generated samples. Extensive experiments on multiple standard benchmarks demonstrate that ShortDF significantly reduces diffusion time (or steps) while enhancing the visual fidelity of generated samples compared with prior work. We believe this work paves the way for interactive diffusion-based applications and establishes a foundation for rapid data generation. Code is available at https://github.com/UnicomAI/ShortDF.

Paper Structure

This paper contains 16 sections, 12 equations, 6 figures, 5 tables, 1 algorithm.

Figures (6)

  • Figure 1: Modeling each reverse step involves identifying a data node $x_t$ that minimizes cumulative transition costs $\mathrm{dist}(x_t, t)$ (akin to a shortest path $P_i$ in a weighted graph). By relaxing the strict 'straight-line' analogy to consider minimal-cost paths in the diffusion graph, we treat the initial residual $|R(t,0)|$ as an alternative path candidate. This relaxation is embedded in the loss function, enabling the reverse graph to iteratively refine the residual toward a shorter path (e.g., collapsing $x_0 \to x_k \to x_t$ into $x_0 \to x_t$), thus optimizing the reconstruction path through dynamic path compression.
  • Figure 2: Sample images generated on the CIFAR-10 dataset at 1, 5, and 10 steps.
  • Figure 3: Performance comparison of our method and DDIM on the CelebA dataset at each time node along the same sampling path (with a total of 20 nodes), following the optimization of the initial residuals.
  • Figure 4: High-quality CelebA image samples generated using our method with only 10 diffusion steps.
  • Figure 5: Denoising quality of our method compared with DDIM on the Church dataset at each time node along the same sampling path, following the optimization of the initial residuals.
  • ...and 1 more figure
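
The path relaxation described in the Figure 1 caption can be illustrated with a toy graph example. The sketch below is only an analogy under stated assumptions, not the authors' implementation: in ShortDF the relaxation is embedded in the training loss and acts on learned residuals, whereas here the costs are hand-picked numbers. The names `relax_residuals`, `direct`, and `edge` are hypothetical. `direct[t]` stands in for the cost of the direct path $x_0 \to x_t$ (the role played by $|R(t,0)|$), and `edge[(k, t)]` for the cost of one hop $x_k \to x_t$.

```python
def relax_residuals(direct, edge):
    """Relax direct-path costs on a toy reverse-step graph.

    direct: dict mapping node t (> 0) to the current cost of the
            direct path 0 -> t (hypothetical stand-in for |R(t, 0)|).
    edge:   dict mapping (k, t) with 0 < k < t to the cost of the
            hop k -> t (hypothetical stand-in for |R(t, k)|).

    Returns a new dict in which each direct cost has been replaced by
    the cheapest cost found via any intermediate node k.
    """
    relaxed = dict(direct)
    nodes = sorted(relaxed)
    # Nodes are visited in increasing order, so relaxed[k] is already
    # final when it is used as a prefix for a later node t.
    for t in nodes:
        for k in nodes:
            if k >= t:
                break
            if (k, t) in edge:
                candidate = relaxed[k] + edge[(k, t)]
                # Standard relaxation step: keep the shorter of the
                # direct path and the path through k.
                if candidate < relaxed[t]:
                    relaxed[t] = candidate
    return relaxed


# Toy costs, chosen by hand for illustration only.
direct = {1: 1.0, 2: 3.0, 3: 5.0}
edge = {(1, 2): 1.0, (1, 3): 2.5, (2, 3): 1.0}

print(relax_residuals(direct, edge))
```

With these toy numbers the direct cost to node 3 drops from 5.0 to 3.0, because the route through nodes 1 and 2 is cheaper. This mirrors the caption's idea of replacing a long direct residual with a shorter composite path and then treating the result as the new direct cost.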