MPL-HMC: A Tunable Parameterized Leapfrog Framework for Robust Hamiltonian Monte Carlo
Sourabh Bhattacharya
TL;DR
MPL-HMC introduces tunable integration parameters $\alpha(\delta t)$ and $\beta(\delta t)$ into a parameterized leapfrog scheme to address stiffness, instability, and multimodality in Hamiltonian Monte Carlo, while keeping the gradient cost per step similar to that of standard HMC. Theoretical analysis shows approximate detailed balance and an energy drift of order $\delta t^2$, with volume changes governed by $2\alpha_2+\beta_2$. Empirically, damping ($\alpha_2=-0.1$, $\beta_2=-0.05$) yields substantial gains on stiff and hierarchical problems (e.g., Neal's funnel), and anti-damping ($\alpha_2=+0.1$, $\beta_2=+0.05$) improves convergence in high-dimensional neural networks. An aggressive MPL-HMC variant with extreme parameter values enables mode hopping and full mode exploration in multimodal distributions, though with weaker guarantees and stability risks. Case studies on Bayesian neural networks and pharmacokinetic modeling validate these gains while highlighting parameter sensitivity and the need for careful tuning. Overall, MPL-HMC extends HMC's applicability to difficult domains by offering interpretable, tunable dynamics without increasing gradient costs, making it a practical tool for modern Bayesian inference.
Abstract
This article introduces the Modified Parameterized Leapfrog Hamiltonian Monte Carlo (MPL-HMC) method, a novel extension of HMC that addresses key limitations of the standard algorithm through tunable integration parameters $\alpha(\delta t)$ and $\beta(\delta t)$, which apply controlled perturbations to the Hamiltonian dynamics. Theoretical analysis demonstrates that MPL-HMC maintains approximate detailed balance. Extensive empirical evaluation reveals systematic performance improvements. The damping variant ($\alpha_2=-0.1$, $\beta_2=-0.05$) achieves a 14-fold increase in effective sample size on Neal's funnel and 27\% better efficiency on pharmacokinetic models. The anti-damping variant ($\alpha_2=0.1$, $\beta_2=0.05$) achieves $\hat{R}=1.026$ on Bayesian neural networks, versus $\hat{R}=1.981$ for standard HMC. For multimodal distributions, we introduce aggressive MPL-HMC, which employs extreme parameters ($\alpha_2=8.0$--$15.0$, $\beta_2=5.0$--$8.0$) with enhanced sampling to achieve full mode exploration where standard methods fail. All variants maintain computational efficiency identical to standard HMC while providing systematic control over damping, exploration, stability, and accuracy. The article provides rigorous mathematical foundations, implementation specifications, parameter tuning strategies, and comprehensive performance comparisons, extending HMC's applicability to previously challenging domains.
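The TL;DR states that volume changes are governed by the combination $2\alpha_2+\beta_2$. As a rough illustration, the short sketch below evaluates that combination for the parameter values quoted above. It is arithmetic only and does not implement the integrator. The function name `volume_coefficient` is hypothetical, and pairing the lower and upper ends of the aggressive ranges with each other is an assumption made purely for illustration.

```python
# Illustrative arithmetic only: evaluates 2*alpha_2 + beta_2, the combination
# the text says governs volume changes. This is not the MPL-HMC integrator.

def volume_coefficient(alpha_2: float, beta_2: float) -> float:
    """Return 2*alpha_2 + beta_2 (hypothetical helper, not from the article)."""
    return 2.0 * alpha_2 + beta_2

# Parameter values quoted in the text. For the aggressive variant, pairing the
# range endpoints (8.0 with 5.0, 15.0 with 8.0) is an assumption.
settings = {
    "damping": (-0.1, -0.05),
    "anti-damping": (0.1, 0.05),
    "aggressive (lower ends)": (8.0, 5.0),
    "aggressive (upper ends)": (15.0, 8.0),
}

coefficients = {
    name: volume_coefficient(a2, b2) for name, (a2, b2) in settings.items()
}

for name, value in coefficients.items():
    print(f"{name:>24s}: 2*alpha_2 + beta_2 = {value:+.2f}")
```

The combination is negative for the damping setting and positive for the anti-damping setting, which is at least consistent with the naming of the two variants. The aggressive values are roughly two orders of magnitude larger in absolute terms, in line with the weaker guarantees and stability risks noted for that variant.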
