Markov Chain Monte Carlo and Variational Inference: Bridging the Gap
Tim Salimans, Diederik P. Kingma, Max Welling
TL;DR
The paper addresses intractable Bayesian posteriors by combining variational inference with MCMC through auxiliary variables, giving a flexible posterior approximation that can be trained with stochastic gradients. It introduces the auxiliary variational lower bound $\mathcal{L}_{aux}$ and develops Markov Chain Variational Inference (MCVI) and Hamiltonian Variational Inference (HVI), which build MCMC steps such as Gibbs sampling, over-relaxation, and Hamiltonian dynamics into the variational approximation. It analyzes practical ways to specify the chain, including detailed balance, annealed inference, multiple iterates, and sequential MCVI, and shows how these choices tighten the bound and improve convergence. In experiments ranging from simple Gaussian targets to deep generative models on MNIST, the approach yields tighter lower bounds, more accurate posterior approximations, and faster optimization, suggesting it can combine the speed of VI with the accuracy of MCMC for scalable Bayesian inference.
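For orientation, the auxiliary bound mentioned above can be written out as follows. The symbols are assumptions not defined in this summary: $x$ is the data, $z$ the latent variables, $y$ the auxiliary variables (for example, the intermediate states of the Markov chain), $q$ the variational approximation, $r(y \mid x, z)$ an auxiliary inference distribution chosen by the user, and $\mathcal{L}$ the ordinary variational lower bound for the marginal $q(z \mid x)$.

$$\mathcal{L}_{aux} = \mathbb{E}_{q(y, z \mid x)}\left[\log p(x, z) + \log r(y \mid x, z) - \log q(y, z \mid x)\right] = \mathcal{L} - \mathbb{E}_{q(z \mid x)}\left[D_{KL}\left(q(y \mid z, x) \,\|\, r(y \mid z, x)\right)\right] \le \mathcal{L} \le \log p(x)$$

The middle equality shows what is given up: $\mathcal{L}_{aux}$ is looser than $\mathcal{L}$ by the expected KL divergence between the true conditional $q(y \mid z, x)$ and its approximation $r$. In exchange, it can be estimated without evaluating the intractable marginal $q(z \mid x)$.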
Abstract
Recent advances in stochastic gradient variational inference have made it possible to perform variational Bayesian inference with posterior approximations containing auxiliary random variables. This enables us to explore a new synthesis of variational inference and Monte Carlo methods where we incorporate one or more steps of MCMC into our variational approximation. By doing so we obtain a rich class of inference algorithms bridging the gap between variational methods and MCMC, and offering the best of both worlds: fast posterior approximation through the maximization of an explicit objective, with the option of trading off additional computation for additional accuracy. We describe the theoretical foundations that make this possible and show some promising first results.
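As a rough illustration of trading computation for accuracy, the sketch below estimates two lower bounds on a toy conjugate Gaussian model where $\log p(x)$ is known in closed form: the ordinary bound for a fixed initial approximation, and the auxiliary bound after one stochastic transition. Everything here is an assumption made for illustration and is not taken from the paper: the model ($z \sim N(0, 1)$, $x \mid z \sim N(z, 1)$), the Langevin-style noisy gradient step used as the transition, the fixed and deliberately crude reverse model `r`, and all function names and parameter values. Nothing is optimized; the sketch only evaluates the bounds by Monte Carlo.

```python
import numpy as np


def log_normal(v, mean, var):
    """Log density of a univariate Gaussian N(v; mean, var)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (v - mean) ** 2 / var)


def log_joint(x, z):
    """Toy model (an assumption): z ~ N(0, 1), x | z ~ N(z, 1)."""
    return log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0)


def estimate_bounds(x=1.0, n_samples=200_000, step=0.25, noise_var=0.25, seed=0):
    """Monte Carlo estimates of the plain and auxiliary lower bounds."""
    rng = np.random.default_rng(seed)

    # Initial approximation q0(z0) = N(0, 1), i.e. the prior.
    z0 = rng.normal(0.0, 1.0, size=n_samples)
    log_q0 = log_normal(z0, 0.0, 1.0)

    # One stochastic transition q(z1 | z0): a noisy gradient step on
    # log p(x, z). For this model the gradient w.r.t. z is x - 2 z.
    mean1 = z0 + step * (x - 2.0 * z0)
    z1 = mean1 + np.sqrt(noise_var) * rng.normal(size=n_samples)
    log_q1 = log_normal(z1, mean1, noise_var)

    # Hypothetical reverse model r(z0 | z1) = N(z1, 1); fixed, not learned.
    log_r = log_normal(z0, z1, 1.0)

    # Plain bound for q0 alone: E[log p(x, z0) - log q0(z0)].
    plain = np.mean(log_joint(x, z0) - log_q0)

    # Auxiliary bound with y = z0 as the auxiliary variable:
    # E[log p(x, z1) + log r(z0 | z1) - log q0(z0) - log q(z1 | z0)].
    aux = np.mean(log_joint(x, z1) + log_r - log_q0 - log_q1)

    # Exact log evidence: marginally x ~ N(0, 2).
    log_evidence = log_normal(x, 0.0, 2.0)
    return plain, aux, log_evidence


if __name__ == "__main__":
    plain, aux, log_evidence = estimate_bounds()
    print(f"plain bound     : {plain:.4f}")
    print(f"auxiliary bound : {aux:.4f}")
    print(f"log p(x)        : {log_evidence:.4f}")
```

With these settings, both estimates should fall below the exact log evidence, and the auxiliary bound should sit above the plain one even though `r` is not tuned. How much a transition helps depends on the transition and on how well `r` matches the true reverse conditional, so a poorly chosen `r` can make the auxiliary bound looser than the plain one.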
