Large deviation-based tuning schemes for Metropolis-Hastings algorithms
Federica Milinanni
TL;DR
The paper addresses the tuning of Metropolis-Hastings (MH) algorithms using a large deviation framework for the empirical measure. It develops an alternative dual representation of the MH rate function and derives practical upper and lower bounds that quantify convergence speed without computing the rate function exactly. Using these bounds, it proposes three large-deviation-based tuning schemes for identifying near-optimal MH hyperparameters, and demonstrates them on an Independent MH example in which tuning aligns the proposal with the target. The work provides a principled method for pre-calibrating MH-type algorithms and paves the way for applying these ideas to more advanced MH variants such as the Metropolis-adjusted Langevin algorithm (MALA) and Hamiltonian Monte Carlo (HMC).
Abstract
Markov chain Monte Carlo (MCMC) methods are one of the most popular classes of algorithms for sampling from a target probability distribution. A growing trend in recent years is to analyze the convergence of MCMC algorithms using tools from the theory of large deviations. In (Milinanni & Nyquist, 2024), a new framework based on this approach was developed to study the convergence of empirical measures associated with algorithms of Metropolis-Hastings type, a broad and popular sub-class of MCMC methods. The goal of this paper is to leverage these large deviation results to improve the efficiency of Metropolis-Hastings algorithms. Specifically, we use the large deviations rate function (a central object in large deviation theory) to quantify and characterize the algorithms' speed of convergence. We begin by extending the analysis from (Milinanni & Nyquist, 2024), deriving alternative representations of the rate function. Building on this, we establish explicit upper and lower bounds, which we then use to design schemes to tune Metropolis-Hastings algorithms.
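To make concrete what "tuning" a Metropolis-Hastings algorithm means in the Independent MH setting mentioned in the TL;DR, the following is a minimal sketch, not the paper's tuning scheme. It assumes a standard normal target and a Gaussian proposal N(0, scale^2); the function names, the choice of target, and the use of the acceptance rate as a diagnostic are all illustrative assumptions. The paper's schemes are instead based on bounds on the large deviations rate function.

```python
import math
import random


def log_target(x):
    # Unnormalized log density of a standard normal target (illustrative assumption).
    return -0.5 * x * x


def log_proposal(x, scale):
    # Log density of N(0, scale^2), up to an additive constant independent of x and scale.
    return -0.5 * (x * x) / (scale * scale) - math.log(scale)


def independent_mh(n_steps, scale, seed=0):
    """Independent MH: proposals are drawn from N(0, scale^2) regardless of the current state.

    `scale` is the hyperparameter one would tune (hypothetical name).
    Returns the chain and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    x = 0.0
    samples = []
    accepted = 0
    for _ in range(n_steps):
        y = rng.gauss(0.0, scale)
        # Log of the Hastings ratio  pi(y) q(x) / (pi(x) q(y)).
        log_ratio = (log_target(y) - log_target(x)) + (
            log_proposal(x, scale) - log_proposal(y, scale)
        )
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) <= log_ratio:
            x = y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


if __name__ == "__main__":
    for scale in (0.5, 1.0, 3.0):
        _, rate = independent_mh(5000, scale)
        print(f"scale={scale}: acceptance rate={rate:.3f}")
```

When `scale = 1` the proposal coincides with the target, the Hastings ratio is identically 1, and every proposal is accepted, so the chain produces independent draws from the target. Any other value of `scale` introduces rejections, which is the sense in which tuning "aligns the proposal with the target" in this toy setting.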
