Quantifying the Speed-Up from Non-Reversibility in MCMC Tempering Algorithms
Gareth O. Roberts, Jeffrey S. Rosenthal
TL;DR
This work quantifies the gain from non-reversibility in MCMC tempering by linking momentum-like updates to a simple double-birth-death chain and deriving a diffusion-limit description. By scaling the temperature spacing with a factor $\ell$ and defining an efficiency function $\text{eff}(\ell)$, the authors derive optimal scaling strategies. They show that, under appropriate assumptions on the tempering setup, optimally scaled non-reversible tempering beats optimally scaled reversible tempering by a factor of about 1.42 in maximum efficiency, that is, an improvement of roughly 42%. The study combines analytic results (diffusion limits, scaling laws, and explicit formulas for reversible versus non-reversible efficiency) with simulations in dimension $d=100$ that corroborate the theoretical efficiency curves and round-trip rates. The findings inform how to choose temperature spacings and momentum-like updates to maximize round-trip efficiency, and they indicate that non-reversibility yields a moderate gain rather than a dramatic overhaul of tempering methods.
Abstract
We investigate the increase in efficiency of simulated and parallel tempering MCMC algorithms when using non-reversible updates to give them "momentum". By making a connection to a certain simple discrete Markov chain, we show that, under appropriate assumptions, the non-reversible algorithms still exhibit diffusive behaviour, just on a different time scale. We use this to argue that the optimally scaled versions of the non-reversible algorithms are indeed more efficient than the optimally scaled versions of their traditional reversible counterparts, but only by a modest speed-up of about 42%.
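To make the idea of "momentum" on a temperature ladder concrete, the following is a minimal toy sketch, not the construction or the efficiency calculation used in the paper. It assumes a ladder of `n_levels` temperatures and a single constant acceptance probability `accept_prob` for every proposed move; both are illustrative simplifications, and the function name `count_round_trips` is hypothetical. The reversible walker proposes up or down with equal probability at each step, while the non-reversible walker keeps its direction until a proposal is rejected and then flips. Round trips (bottom to top and back) are counted for each.

```python
import random


def count_round_trips(n_levels, accept_prob, n_steps, non_reversible, seed=0):
    """Count bottom -> top -> bottom round trips of a walker on a ladder.

    Toy model (assumption): every proposed move between adjacent levels is
    accepted with the same probability `accept_prob`; moves off the ends of
    the ladder are always rejected.
    """
    rng = random.Random(seed)
    top = n_levels - 1
    pos = 0             # current level index
    direction = 1       # current direction, used by the non-reversible walker
    heading_up = True   # True while the walker still has to reach the top
    round_trips = 0

    for _ in range(n_steps):
        if not non_reversible:
            # Reversible walker: fresh random direction at every step.
            direction = rng.choice((-1, 1))

        proposal = pos + direction
        accepted = 0 <= proposal <= top and rng.random() < accept_prob

        if accepted:
            pos = proposal
        elif non_reversible:
            # Non-reversible walker: reverse direction only on rejection.
            direction = -direction

        if heading_up and pos == top:
            heading_up = False
        elif not heading_up and pos == 0:
            heading_up = True
            round_trips += 1

    return round_trips


if __name__ == "__main__":
    for label, flag in (("reversible", False), ("non-reversible", True)):
        trips = count_round_trips(10, 0.8, 200_000, non_reversible=flag)
        print(f"{label}: {trips} round trips")
```

Because the acceptance probability is held fixed here instead of being tied to the temperature spacing through $\ell$, the ratio of round-trip counts in this toy should not be expected to match the 42% figure, which concerns the optimally scaled algorithms.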
