Conditional sampling within generative diffusion models

Zheng Zhao, Ziwei Luo, Jens Sjölund, Thomas B. Schön

TL;DR

This survey addresses the problem of sampling from a conditional distribution $\pi(\cdot \mid y)$ using generative diffusion models, focusing on two practical data-access regimes: access to samples from the joint distribution $\pi_{X,Y}$, and access to the marginal $\pi_X$ together with an explicit likelihood. It synthesizes three families of conditional samplers: (i) joint-bridging approaches that train a diffusion directly for $\pi(\cdot \mid y)$, (ii) filtering-based methods that operate on the joint distribution via forward-backward path concepts, and (iii) Feynman–Kac-based methods that combine an explicit likelihood with a pre-trained marginal model and target the posterior via sequential Monte Carlo. The article details the Anderson and Schrödinger-bridge foundations, discusses the practicalities, biases, and computational trade-offs of each approach, and provides a pedagogical example comparing their performance. The work highlights opportunities for diagnostic tools, for handling outliers, and for further integrating deep learning advances to make conditional diffusion samplers more robust in complex inverse problems.

Abstract

Generative diffusions are a powerful class of Monte Carlo samplers that leverage bridging Markov processes to approximate complex, high-dimensional distributions, such as those found in image processing and language models. Despite their success in these domains, an important open challenge remains: extending these techniques to sample from conditional distributions, as required in, for example, Bayesian inverse problems. In this paper, we present a comprehensive review of existing computational approaches to conditional sampling within generative diffusion models. Specifically, we highlight key methodologies that either utilise the joint distribution, or rely on (pre-trained) marginal distributions with explicit likelihoods, to construct conditional generative samplers.
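
To make the second regime concrete (a sampler for the marginal plus an explicit likelihood), the following is a minimal sketch of self-normalised importance resampling, the single-step building block that sequential Monte Carlo methods repeat along a diffusion path. It is not the algorithm from the paper. The Gaussian prior, the Gaussian likelihood, the observation `y = 1.0`, and the helper name `conditional_resample` are all assumptions chosen for illustration, and the standard-normal draws merely stand in for samples from a pre-trained marginal model.

```python
import numpy as np


def conditional_resample(prior_samples, log_likelihood, rng):
    """Resample prior draws in proportion to their likelihood weights.

    Hypothetical helper for illustration: it targets pi(x | y), which is
    proportional to pi_X(x) * p(y | x), using samples from pi_X only.
    """
    log_w = log_likelihood(prior_samples)
    log_w = log_w - np.max(log_w)      # stabilise before exponentiating
    w = np.exp(log_w)
    w = w / np.sum(w)                  # self-normalised importance weights
    n = len(prior_samples)
    idx = rng.choice(n, size=n, p=w)   # multinomial resampling
    return prior_samples[idx], w


rng = np.random.default_rng(0)
y = 1.0  # assumed observation

# Stand-in for draws from a pre-trained marginal sampler: X ~ N(0, 1).
x = rng.standard_normal(200_000)


def log_lik(x):
    # Assumed explicit likelihood Y | X = x ~ N(x, 1), up to an additive constant.
    return -0.5 * (y - x) ** 2


posterior_samples, weights = conditional_resample(x, log_lik, rng)
posterior_mean = float(np.mean(posterior_samples))
posterior_var = float(np.var(posterior_samples))
print(posterior_mean, posterior_var)
```

In this toy Gaussian setting the exact conditional is $N(y/2, 1/2)$, so the printed mean and variance can be checked against 0.5 and 0.5. In high dimensions a single reweighting step like this typically degenerates, which is one motivation for spreading the correction over many intermediate steps.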

Paper Structure (14 sections, 26 equations, 2 figures, 4 algorithms)

This paper contains 14 sections, 26 equations, 2 figures, 4 algorithms.

Figures (2)

  • Figure 1: Illustration of conditional samples (blue scatter points) for three conditions on $y$; the contours show the true conditional density function. For each method we draw 10,000 samples and downsample them to 2,000 for visibility. All three methods recover the true conditional distribution, but with small biases. For instance, CDSB with $y=5$ produces biased samples between the two modes.
  • Figure 2: The marginal histograms under the condition $y=5$ (corresponding to the last row in Figure 1). Although the CDSB and Feynman–Kac methods recover the shape of the distribution, they are biased (e.g., note CDSB at around $x_1 = 0$ and Feynman–Kac at around $x_2 = -2$).