Discrete diffusion samplers and bridges: Off-policy algorithms and applications in latent spaces

Arran Carter, Sanghyeok Choi, Kirill Tamogashev, Víctor Elvira, Nikolay Malkin

TL;DR

The paper tackles sampling from discrete energy-based targets $p_{\text{target}}(x)=\frac{1}{Z}e^{-\mathcal{E}(x)}$ where $Z$ is unknown, by introducing off-policy training for discrete diffusion samplers and extending the framework to discrete data-to-energy Schrödinger bridges. It develops forward/backward kernel formulations, second-moment trajectory objectives, and variable-time discretisation, then enhances training with replay buffers, importance weighting, and MCMC exploration. A key contribution is generalising Schrödinger bridges to cases where one or both distributions are specified by an energy function, enabling data-to-energy and energy-to-energy bridging via IPF-like updates. The approach is validated on Ising/Potts models and discretised synthetic densities, showing improved mode coverage and sampling quality, and is extended to outsourced posterior sampling in discrete latent spaces (e.g., VQ-VAE on MNIST). Together, these results advance efficient sampling and conditional inference in discrete spaces and open avenues for data-free distribution alignment and latent-space posterior tasks in generative modeling.

Abstract

Sampling from a distribution $p(x) \propto e^{-\mathcal{E}(x)}$ known up to a normalising constant is an important and challenging problem in statistics. Recent years have seen the rise of a new family of amortised sampling algorithms, commonly referred to as diffusion samplers, that enable fast and efficient sampling from an unnormalised density. Such algorithms have been widely studied for continuous-space sampling tasks; however, their application to problems in discrete space remains largely unexplored. Although some progress has been made in this area, discrete diffusion samplers do not take full advantage of ideas commonly used for continuous-space sampling. In this paper, we propose to bridge this gap by introducing off-policy training techniques for discrete diffusion samplers. We show that these techniques improve the performance of discrete samplers on both established and new synthetic benchmarks. Next, we generalise discrete diffusion samplers to the task of bridging between two arbitrary distributions, introducing data-to-energy Schrödinger bridge training for the discrete domain for the first time. Lastly, we showcase the application of the proposed diffusion samplers to data-free posterior sampling in the discrete latent spaces of image generative models.
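To make "known up to a normalising constant" concrete, the following minimal sketch evaluates an unnormalised discrete density $e^{-\mathcal{E}(x)}$ for a small Ising-type energy and computes $Z$ by brute force. The energy convention (nearest-neighbour coupling, periodic boundaries, inverse temperature `beta` folded into the energy) and all function names are assumptions made for illustration; the paper's exact parameterisation is not given in this summary.

```python
import itertools
import math


def ising_energy(spins, n, beta=0.5):
    """Energy of an n x n spin configuration with periodic boundaries.

    spins: flat sequence of +1/-1 values in row-major order.
    Assumed convention: E(x) = -beta * sum of x_i * x_j over the right
    and down neighbour of every site.
    """
    total = 0
    for r in range(n):
        for c in range(n):
            s = spins[r * n + c]
            right = spins[r * n + (c + 1) % n]
            down = spins[((r + 1) % n) * n + c]
            total += s * (right + down)
    return -beta * total


def unnormalised_density(spins, n, beta=0.5):
    """exp(-E(x)): cheap to evaluate for any single configuration."""
    return math.exp(-ising_energy(spins, n, beta))


def brute_force_log_z(n, beta=0.5):
    """log Z by enumerating all 2^(n*n) states; only feasible for tiny n."""
    states = itertools.product((-1, 1), repeat=n * n)
    return math.log(sum(unnormalised_density(s, n, beta) for s in states))


if __name__ == "__main__":
    print(brute_force_log_z(3))  # 512 states: still tractable
    print(2 ** (16 * 16))        # state count for a 16 x 16 binary lattice
```

Evaluating the energy of one configuration is cheap, but the sum defining $Z$ grows as $2^{n^2}$, which is why a sampler that only needs energy evaluations is attractive.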

Paper Structure (63 sections, 24 equations, 13 figures, 8 tables, 2 algorithms)

Figures (13)

  • Figure 1: Trajectories sampled from a learnt approximation to the discrete Schrödinger bridge between a mixture of three Gaussians (left, given by samples), and a mixture of two Gaussians (right, given by an unnormalised density), both quantised and represented as 6-bit binary Gray codes. Background shading represents the marginal densities at each step.
  • Figure 2: Samples generated by each method for the $16 \times 16$ Potts model with $\beta=1.005$ (top) and $\beta=1.2$ (bottom), $q=3$.
  • Figure 3: Visualisation of samples generated by each method for the discretised 40GMM ($d=4\times8=32$), projected to the first two dimensions. The crosses are placed at the 40 modes of the mixture. Only the off-policy TB + Buffer + MCMC discovers all 40 modes.
  • Figure 4: Comparison of discrete-space Schrödinger bridges on 16-dimensional binary Gray-coded spatial data learnt by data-to-energy IPF with on-policy and off-policy LV training.
  • Figure 5: Comparison between decoded samples from the true posterior and from diffusion samplers (trained on- and off-policy) in the latent space of a VQ-VAE trained on MNIST, showing that outsourced discrete diffusion samplers successfully allow conditioning by learning to sample in latent space.
  • ...and 8 more figures
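
The captions of Figures 1 and 4 refer to quantised spatial data represented as binary Gray codes. As a rough illustration of that encoding, the sketch below quantises a real value into $2^6$ bins and maps the bin index to a 6-bit reflected Gray code, in which neighbouring bins differ in exactly one bit. The function names, the uniform binning, and the `low`/`high` range are hypothetical choices; the paper's actual quantisation scheme is not described in this summary.

```python
def to_gray(index: int) -> int:
    """Reflected binary Gray code of a non-negative integer."""
    return index ^ (index >> 1)


def from_gray(gray: int) -> int:
    """Inverse of to_gray: XOR together all right shifts of the code."""
    index = 0
    while gray:
        index ^= gray
        gray >>= 1
    return index


def gray_bits(index: int, num_bits: int = 6) -> list:
    """Gray code of `index` as a list of bits, most significant first."""
    g = to_gray(index)
    return [(g >> k) & 1 for k in reversed(range(num_bits))]


def quantise(value: float, low: float, high: float, num_bits: int = 6) -> list:
    """Uniformly bin `value` in [low, high] and return its Gray-coded bits."""
    levels = 2 ** num_bits
    clipped = min(max(value, low), high)
    index = min(int((clipped - low) / (high - low) * levels), levels - 1)
    return gray_bits(index, num_bits)


if __name__ == "__main__":
    for i in range(4):
        print(i, gray_bits(i))
    print(quantise(0.0, -1.0, 1.0))
```

Because adjacent bins differ in a single bit, a single bit flip can move a sample to a spatially neighbouring bin, which is one common motivation for preferring Gray codes over plain binary when discretising continuous coordinates.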