Space-Time Diffusion Bridge

Hamidreza Behjoo, Michael Chertkov

TL;DR

This paper introduces space-time diffusion bridges as a theoretical and practical framework for generating i.i.d. samples from high-dimensional distributions defined implicitly by Ground Truth (GT) samples. It decomposes the process into affine space-time diffusion, diffusion-bridge (DB) conditioning, and nonlinear score-based refinements, with forward-time (non-denoising) and reverse-time (denoising) variants. Key contributions include a closed-form DB for affine drift, GT-aligned nonlinear extensions via score matching, ELBO-based optimization of the affine parameters, and a GT-driven training paradigm. Empirical validation on MNIST and CIFAR-10 demonstrates competitive, scalable sampling performance and highlights the potential for simulation-free inference and extension to high-resolution data.

Abstract

In this study, we introduce a novel method for generating new synthetic samples that are independent and identically distributed (i.i.d.) from high-dimensional real-valued probability distributions, as defined implicitly by a set of Ground Truth (GT) samples. Central to our method is the integration of space-time mixing strategies that extend across temporal and spatial dimensions. Our methodology is underpinned by three interrelated stochastic processes designed to enable optimal transport from an easily tractable initial probability distribution to the target distribution represented by the GT samples: (a) linear processes incorporating space-time mixing that yield Gaussian conditional probability densities, (b) their diffusion bridge analogs that are conditioned on the initial and final state vectors, and (c) nonlinear stochastic processes refined through score-matching techniques. The crux of our training regime involves fine-tuning the nonlinear model, and potentially the linear models, to align closely with the GT data. We validate the efficacy of our space-time diffusion approach with numerical experiments, laying the groundwork for more extensive future theory and experiments to fully authenticate the method, particularly with a view to more efficient (possibly simulation-free) inference.
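
As a point of reference for process (b), the simplest diffusion bridge is the standard Brownian bridge. The identity below is a textbook result, not a formula taken from this paper. The symbols $\bm{x}_0$, $\bm{x}_1$, and $\bm{W}_t$ are introduced here for illustration, and unit diffusion with no spatial mixing is assumed:

$$
d\bm{x}_t = \frac{\bm{x}_1 - \bm{x}_t}{1 - t}\,dt + d\bm{W}_t, \qquad
p(\bm{x}_t \mid \bm{x}_0, \bm{x}_1) = \mathcal{N}\!\left(\bm{x}_t;\ (1-t)\,\bm{x}_0 + t\,\bm{x}_1,\ t(1-t)\,\bm{I}\right), \quad t \in [0, 1].
$$

Because this conditional density is Gaussian in closed form, $\bm{x}_t$ can be drawn at any intermediate time without simulating the path, which illustrates why Gaussian conditional densities are a convenient starting point.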

Paper Structure

This paper contains 16 sections, 28 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: Comparison of FID scores between forward-time and reverse-time generative models employing the Brownian Bridge scheme, characterized by $\bar{\bm{A}}(t) = \bm{I}/(1-t)$ and $\bm{\kappa}(t) = \bm{I}$. With 1000 discretization steps, both models achieve an FID score of approximately 2, indicating that the generated images are close to the real data distribution in quality and diversity at this level of discretization.
  • Figure 2: Comparison of FID scores for space-time DB models employing the forward-time and reverse-time approaches, as elaborated in the accompanying text. With 1000 discretization steps, the FID score is approximately 1.5, indicating that these models capture the underlying data distribution with high fidelity.
  • Figure 3: High-fidelity images synthesized by the space-time DB model within the reverse-time framework, using 1000 discretization steps.
  • Figure 4: The first row shows the mean of the space-time diffusion bridge, and the second row shows the amount of noise added at each time step. The third row shows the combination of blurring and noise. All results are shown for $t \in [\varepsilon, 1-\varepsilon]$.
  • Figure 5: Samples of generated images from the space-time diffusion bridge.
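
The Brownian Bridge scheme named in the Figure 1 caption, together with its dependence on the number of discretization steps, can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation. It assumes that $\bar{\bm{A}}(t) = \bm{I}/(1-t)$ and $\bm{\kappa}(t) = \bm{I}$ correspond to the textbook one-dimensional Brownian bridge SDE $dx = (x_1 - x)/(1-t)\,dt + dW$, since the paper's drift definition does not appear in this section. The function names `brownian_bridge_path` and `midpoint_stats` are hypothetical.

```python
import math
import random


def brownian_bridge_path(x0, x1, n_steps, rng):
    """Euler-Maruyama path of dx = (x1 - x)/(1 - t) dt + dW on [0, 1].

    Assumption: scalar state and unit diffusion (the textbook Brownian bridge).
    Returns a list of n_steps + 1 values, starting at x0 and ending at x1.
    """
    dt = 1.0 / n_steps
    x = x0
    path = [x]
    for k in range(n_steps - 1):
        t = k * dt
        drift = (x1 - x) / (1.0 - t)  # pulls the state toward the endpoint x1
        x = x + drift * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x)
    # At the last step the Euler drift term equals x1 - x (up to rounding),
    # so the endpoint is set to x1 directly and no noise is added.
    path.append(x1)
    return path


def midpoint_stats(x0, x1, n_steps, n_paths, seed):
    """Sample mean and variance of the bridge at t = 0.5 over n_paths paths."""
    rng = random.Random(seed)
    mid = n_steps // 2
    samples = [brownian_bridge_path(x0, x1, n_steps, rng)[mid]
               for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    var = sum((s - mean) ** 2 for s in samples) / (n_paths - 1)
    return mean, var


if __name__ == "__main__":
    mean, var = midpoint_stats(0.0, 2.0, n_steps=100, n_paths=2000, seed=1)
    # Theory for the continuous-time bridge at t = 0.5:
    # mean = 0.5 * (x0 + x1), variance = t * (1 - t) = 0.25.
    print(f"midpoint mean = {mean:.3f}, midpoint variance = {var:.3f}")
```

The sample statistics at the midpoint can be compared with the continuous-time values, a mean of $(x_0 + x_1)/2$ and a variance of $1/4$. A coarser grid (smaller `n_steps`) increases the discretization error in the variance, which is the same kind of trade-off that the FID-versus-steps comparisons explore for the full models.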