Discrete diffusion samplers and bridges: Off-policy algorithms and applications in latent spaces
Arran Carter, Sanghyeok Choi, Kirill Tamogashev, Víctor Elvira, Nikolay Malkin
TL;DR
The paper tackles sampling from discrete energy-based targets $p_{\text{target}}(x)=\frac{1}{Z}e^{-\mathcal{E}(x)}$ where $Z$ is unknown, by introducing off-policy training for discrete diffusion samplers and extending the framework to discrete data-to-energy Schrödinger bridges. It develops forward/backward kernel formulations, second-moment trajectory objectives, and variable-time discretisation, then enhances training with replay buffers, importance weighting, and MCMC exploration. A key contribution is generalising Schrödinger bridges to cases where one or both distributions are specified by an energy function, enabling data-to-energy and energy-to-energy bridging via IPF-like updates. The approach is validated on Ising/Potts models and discretised synthetic densities, showing improved mode coverage and sampling quality, and is extended to outsourced posterior sampling in discrete latent spaces (e.g., VQ-VAE on MNIST). Together, these results advance efficient sampling and conditional inference in discrete spaces and open avenues for data-free distribution alignment and latent-space posterior tasks in generative modeling.
Abstract
Sampling from a distribution $p(x) \propto e^{-\mathcal{E}(x)}$ known up to a normalising constant is an important and challenging problem in statistics. Recent years have seen the rise of a new family of amortised sampling algorithms, commonly referred to as diffusion samplers, that enable fast and efficient sampling from an unnormalised density. Such algorithms have been widely studied for continuous-space sampling tasks; however, their application to problems in discrete space remains largely unexplored. Although some progress has been made in this area, discrete diffusion samplers do not take full advantage of ideas commonly used for continuous-space sampling. In this paper, we propose to bridge this gap by introducing off-policy training techniques for discrete diffusion samplers. We show that these techniques improve the performance of discrete samplers on both established and new synthetic benchmarks. Next, we generalise discrete diffusion samplers to the task of bridging between two arbitrary distributions, introducing data-to-energy Schrödinger bridge training for the discrete domain for the first time. Lastly, we showcase the application of the proposed diffusion samplers to data-free posterior sampling in the discrete latent spaces of image generative models.
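To make the problem setting concrete, the following is a minimal sketch of sampling-related computations for a distribution $p(x) \propto e^{-\mathcal{E}(x)}$ on a discrete space. It is not the method proposed in the paper. It assumes a tiny 1D Ising chain with free boundaries as the energy, computes the normalising constant by brute force (feasible only because the state space has $2^n$ elements with small $n$), and estimates an expectation by self-normalised importance sampling with a uniform proposal, which needs only the unnormalised density. All function names and parameters here are hypothetical and chosen for illustration.

```python
import itertools
import math
import random


def ising_energy(spins, coupling=1.0):
    """Energy of a 1D Ising chain with free boundaries:
    E(x) = -J * sum_i x_i * x_{i+1}, with each x_i in {-1, +1}."""
    return -coupling * sum(a * b for a, b in zip(spins, spins[1:]))


def exact_partition_function(n, coupling=1.0):
    """Brute-force Z = sum_x exp(-E(x)).
    The sum has 2**n terms, so this is only feasible for tiny n."""
    return sum(
        math.exp(-ising_energy(x, coupling))
        for x in itertools.product((-1, 1), repeat=n)
    )


def snis_mean(f, n, num_samples, rng, coupling=1.0):
    """Self-normalised importance sampling estimate of E_p[f(x)].
    Assumption: the proposal is uniform over {-1, +1}^n, so the importance
    weight of x is proportional to exp(-E(x)) and Z cancels in the ratio."""
    samples = [
        tuple(rng.choice((-1, 1)) for _ in range(n)) for _ in range(num_samples)
    ]
    weights = [math.exp(-ising_energy(x, coupling)) for x in samples]
    weighted_sum = sum(w * f(x) for w, x in zip(weights, samples))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 4
    print("Z (brute force):", exact_partition_function(n))
    # Nearest-neighbour correlation E[x_1 * x_2] under the target.
    print("SNIS estimate:", snis_mean(lambda x: x[0] * x[1], n, 20000, rng))
```

For this particular chain the results can be checked against closed forms: $Z = 2\,(2\cosh J)^{n-1}$ and $\mathbb{E}[x_1 x_2] = \tanh J$. A uniform proposal degrades quickly as $n$ grows, because almost all proposed states receive negligible weight, which is one motivation for learned, amortised samplers.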
