Discrete Neural Flow Samplers with Locally Equivariant Transformer
Zijing Ou, Ruixiang Zhang, Yingzhen Li
TL;DR
Discrete Neural Flow Samplers (DNFS) address sampling from unnormalised discrete distributions by learning the rate matrix $R_t$ of a continuous-time Markov chain (CTMC) that transports a prior to the target along an annealing path while satisfying the Kolmogorov forward equation. To cope with the intractable partition function $Z_t$, the approach employs control variates and a coordinate-descent learning scheme. It reduces computational cost with locally equivariant networks, which implement a one-way rate matrix via a Locally Equivariant Transformer (leTF). The method is demonstrated on sampling from unnormalised distributions, training discrete energy-based models (EBMs), and solving combinatorial optimisation problems (COPs), with a graph-aware extension (leGF) handling COPs such as maximum independent set (MIS) and MaxCut. DNFS provides competitive sampling quality, enables end-to-end training of EBMs with neural samplers, and offers a scalable, data-free alternative to diffusion-inspired discrete samplers, with potential for MCMC refinement and broader applicability to discrete domains.
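To make the Kolmogorov forward equation concrete, the following is a minimal sketch on a four-state CTMC. It is not the DNFS method: the rate matrix here is random and constant in time rather than learned, and the convention $\partial_t p_t(y) = \sum_x p_t(x) R(x, y)$ (rows of $R$ sum to zero), the function names `make_rate_matrix` and `evolve`, and the Euler integrator are all assumptions made for illustration.

```python
import numpy as np

def make_rate_matrix(n, rng):
    """Random CTMC generator (illustrative): R[x, y] >= 0 for x != y, rows sum to zero."""
    R = rng.uniform(0.0, 1.0, size=(n, n))
    np.fill_diagonal(R, 0.0)
    # Diagonal entries are minus the total outgoing rate of each state.
    np.fill_diagonal(R, -R.sum(axis=1))
    return R

def evolve(p0, R, t_end=1.0, n_steps=10_000):
    """Euler integration of the forward equation dp/dt = R^T p."""
    p = p0.copy()
    dt = t_end / n_steps
    for _ in range(n_steps):
        p = p + dt * (R.T @ p)
    return p

rng = np.random.default_rng(0)
R = make_rate_matrix(4, rng)
p0 = np.array([1.0, 0.0, 0.0, 0.0])  # all mass starts on state 0
p1 = evolve(p0, R)
print("marginal at t=1:", p1, "total mass:", p1.sum())
```

Because every row of the rate matrix sums to zero, the update moves probability mass between states without creating or destroying it, so the marginal stays normalised along the whole trajectory.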
Abstract
Sampling from unnormalised discrete distributions is a fundamental problem across various domains. While Markov chain Monte Carlo offers a principled approach, it often suffers from slow mixing and poor convergence. In this paper, we propose Discrete Neural Flow Samplers (DNFS), a trainable and efficient framework for discrete sampling. DNFS learns the rate matrix of a continuous-time Markov chain such that the resulting dynamics satisfy the Kolmogorov equation. As this objective involves the intractable partition function, we employ control variates to reduce the variance of its Monte Carlo estimation, leading to a coordinate descent learning algorithm. To further improve computational efficiency, we propose the locally equivariant Transformer, a novel parameterisation of the rate matrix that significantly improves training efficiency while preserving network expressiveness. Empirically, we demonstrate the efficacy of DNFS in a wide range of applications, including sampling from unnormalised distributions, training discrete energy-based models, and solving combinatorial optimisation problems.
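As a generic illustration of how a control variate lowers the variance of a Monte Carlo estimate, here is a minimal sketch on a toy integral. The abstract does not specify the construction used in DNFS, so everything below is an assumption chosen for clarity: the target $\mathbb{E}[e^X]$ with $X \sim \mathrm{Uniform}(0, 1)$, the control variate $g(X) = X$ with known mean $1/2$, and the helper name `control_variate_estimate`.

```python
import numpy as np

def control_variate_estimate(f_vals, g_vals, g_mean):
    """Estimate E[f] using g, whose mean g_mean is known, as a control variate."""
    cov = np.cov(f_vals, g_vals)
    c = cov[0, 1] / cov[1, 1]  # empirical variance-minimising coefficient
    return np.mean(f_vals - c * (g_vals - g_mean))

rng = np.random.default_rng(0)
plain, cv = [], []
for _ in range(200):  # repeat to compare the spread of both estimators
    x = rng.uniform(0.0, 1.0, size=100)
    f = np.exp(x)  # toy target: E[exp(X)] = e - 1
    g = x          # toy control variate with known mean 1/2
    plain.append(f.mean())
    cv.append(control_variate_estimate(f, g, 0.5))

plain_var, cv_var = np.var(plain), np.var(cv)
print("plain variance:", plain_var, "control-variate variance:", cv_var)
```

The correction term has zero mean, so it leaves the expectation unchanged (up to the small bias from estimating the coefficient), while its strong correlation with the integrand cancels much of the sampling noise.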
