Improved off-policy training of diffusion samplers
Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin
TL;DR
The paper addresses sampling from unnormalized densities with diffusion models and continuous GFlowNets. It introduces a unified library of diffusion-structured samplers and a novel exploration strategy for off-policy training that combines local search with a replay buffer. A comparison of diffusion-based and off-policy methods shows that simple exploration improves performance, that Langevin-type inductive biases improve credit assignment, and that local search substantially mitigates mode collapse. Key contributions include empirical benchmarks across diverse target densities, an analysis of credit-assignment strategies, and a practical off-policy exploration method with demonstrated gains. The work advances amortized inference with diffusion samplers and releases public code to support reproducibility and future research on efficient high-dimensional sampling and latent-variable inference.
Abstract
We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at https://github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.
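To make the exploration idea concrete, the following is a minimal sketch of local search in the target space combined with a replay buffer. It is illustrative only and is not taken from the released code. The names `log_reward`, `local_search`, and `ReplayBuffer`, the two-mode Gaussian target, the random-walk Metropolis update, and all hyperparameters are assumptions made for this example; the local search procedure used in the paper may differ.

```python
import numpy as np


def log_reward(x):
    """Hypothetical target: unnormalized log-density of a two-mode 2D Gaussian mixture."""
    centers = np.array([[-3.0, -3.0], [3.0, 3.0]])
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # shape (n, 2)
    return np.logaddexp(-0.5 * sq_dist[:, 0], -0.5 * sq_dist[:, 1])


def local_search(x, rng, n_steps=50, step_size=0.5):
    """Refine a batch of samples with random-walk Metropolis (an assumed stand-in)."""
    logr = log_reward(x)
    for _ in range(n_steps):
        proposal = x + step_size * rng.standard_normal(x.shape)
        logr_prop = log_reward(proposal)
        # Symmetric proposal, so the acceptance ratio is just the density ratio.
        accept = np.log(rng.uniform(size=len(x))) < logr_prop - logr
        x = np.where(accept[:, None], proposal, x)
        logr = np.where(accept, logr_prop, logr)
    return x, logr


class ReplayBuffer:
    """Fixed-capacity store of refined samples and their log-rewards."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.x = None
        self.logr = None

    def add(self, x, logr):
        if self.x is None:
            self.x, self.logr = x, logr
        else:
            self.x = np.concatenate([self.x, x])
            self.logr = np.concatenate([self.logr, logr])
        # Keep only the most recent `capacity` entries.
        self.x = self.x[-self.capacity:]
        self.logr = self.logr[-self.capacity:]

    def sample(self, n, rng):
        idx = rng.integers(0, len(self.x), size=n)
        return self.x[idx], self.logr[idx]


rng = np.random.default_rng(0)
x0 = rng.standard_normal((64, 2))      # stand-in for samples drawn from the current sampler
refined, logr = local_search(x0, rng)  # explore the target space around those samples
buffer = ReplayBuffer(capacity=1000)
buffer.add(refined, logr)
batch_x, batch_logr = buffer.sample(16, rng)  # off-policy batch for a training step
```

In an off-policy training loop, samples drawn from the buffer would serve as training targets alongside samples from the current sampler. The training objective itself is omitted here.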
