ATLAS: Adapting Trajectory Lengths and Step-Size for Hamiltonian Monte Carlo

Chirag Modi

TL;DR

This work develops a strategy to locally adapt the step size parameter of HMC at every iteration by evaluating a low-rank approximation of the local Hessian and estimating its largest eigenvalue, resulting in an adaptive sampler, ATLAS, which is more robust to the tuning of hyperparameters.

Abstract

Hamiltonian Monte-Carlo (HMC) and its auto-tuned variant, the No U-Turn Sampler (NUTS), can struggle to accurately sample distributions with complex geometries, e.g., varying curvature, due to their constant step size for leapfrog integration and fixed mass matrix. In this work, we develop a strategy to locally adapt the step size parameter of HMC at every iteration by evaluating a low-rank approximation of the local Hessian and estimating its largest eigenvalue. We combine it with a strategy to similarly adapt the trajectory length by monitoring the no U-turn condition, resulting in an adaptive sampler, ATLAS: adapting trajectory length and step-size. We further use a delayed rejection framework for making multiple proposals that improves the computational efficiency of ATLAS, and develop an approach for automatically tuning its hyperparameters during warmup. We compare ATLAS with state-of-the-art samplers like NUTS on a suite of synthetic and real-world examples, and show that i) unlike NUTS, ATLAS is able to accurately sample difficult distributions with complex geometries, ii) it is computationally competitive with NUTS for simpler distributions, and iii) it is more robust to the tuning of hyperparameters.
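
To make the step-size idea concrete, here is a minimal sketch of choosing a local step size from the largest Hessian eigenvalue. It is not the paper's procedure: the abstract describes a low-rank Hessian approximation, whereas this sketch swaps in plain power iteration with finite-difference Hessian-vector products. The function names, the `safety` factor, and the unit mass matrix are all assumptions made for illustration. The only fact relied on is that leapfrog with unit mass is stable on a quadratic potential when the step size is below $2/\sqrt{\lambda_{\max}}$.

```python
import numpy as np


def largest_hessian_eigenvalue(grad_U, x, n_iter=50, fd_eps=1e-5, seed=0):
    """Estimate the largest-magnitude Hessian eigenvalue of U at x.

    Uses power iteration (a stand-in technique, not the paper's low-rank
    approximation) with finite-difference Hessian-vector products.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)
    g0 = grad_U(x)
    lam = 0.0
    for _ in range(n_iter):
        # H v is approximated by (grad U(x + h v) - grad U(x)) / h
        hv = (grad_U(x + fd_eps * v) - g0) / fd_eps
        lam = float(v @ hv)  # Rayleigh quotient for the current unit vector
        norm = np.linalg.norm(hv)
        if norm == 0.0:
            break
        v = hv / norm
    return abs(lam)


def local_step_size(grad_U, x, safety=0.5, floor=1e-12):
    """Hypothetical rule: eps = safety / sqrt(lambda_max), with safety < 2."""
    lam = max(largest_hessian_eigenvalue(grad_U, x), floor)
    return safety / np.sqrt(lam)


# Usage on an ill-conditioned Gaussian with precision matrix diag(1, 100).
A = np.diag([1.0, 100.0])
grad_U = lambda x: A @ x
eps = local_step_size(grad_U, np.array([0.3, -0.2]))
```

For a target whose curvature changes with position, such as a funnel, calling `local_step_size` at each new state would shrink the step in the narrow region and enlarge it in the wide one, which is the behavior a fixed step size cannot provide.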

Paper Structure

This paper contains 36 sections, 1 theorem, 19 equations, 8 figures, 6 algorithms.

Key Result

Lemma 1

Let $F$ be a volume-preserving involution with parameters $\epsilon$ and $n$ and $\pi$ be the target density. Then MH with the deterministic proposal kernel $q_F$ given by eq:gen-kernel, with acceptance probability $\alpha$ obeying has detailed balance with respect to $\pi$.
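
For orientation, the textbook special case can be written out; it assumes fixed parameters $\epsilon$ and $n$ and is not necessarily the condition on $\alpha$ that the lemma itself imposes. Writing $z$ for the current state (a symbol introduced here) and taking the usual Metropolis-Hastings acceptance probability for a deterministic proposal $z \mapsto F(z)$:

$$
\alpha(z) = \min\left(1, \frac{\pi(F(z))}{\pi(z)}\right)
$$

Because $F$ is an involution, $F(F(z)) = z$, and because it is volume-preserving, no Jacobian factor appears. Detailed balance then reduces to a symmetric identity:

$$
\pi(z)\,\alpha(z) = \min\big(\pi(z), \pi(F(z))\big) = \pi(F(z))\,\alpha(F(z))
$$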

Figures (8)

  • Figure 1: NUTS with constant step-size can fail to accurately sample distributions with varying curvature like Neal's funnel (left) and Rosenbrock (right).
  • Figure 2: DR-HMC schematic adapted from Modi_2023 with current state $x$, proposed state $x''$, previously rejected proposals $x'$, and ghost proposal $g$.
  • Figure 3: (Left) One iteration for NoUT sampler from the initial point $x^{(0)}$, and proposal $x^{(n)}$. (Right) A sub u-turn: reverse trajectory (orange) ends before reaching the initial point $x^{(0)}$, leading to rejection.
  • Figure 4: Models with complex geometry, e.g., variants of Neal's funnel and Rosenbrock distribution. NUTS with fixed step size fails to correctly sample these distributions, but ATLAS with step size adaptation is able to.
  • Figure 5: Normalized RMSE for baseline models where NUTS is accurate. Third row shows the cost in terms of number of gradient evaluations normalized against NUTS. ATLAS is computationally competitive with NUTS.
  • ...and 3 more figures
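
The no U-turn condition mentioned in the abstract and in Figure 3 can be illustrated with a short sketch. It assumes the standard one-directional criterion, which stops the forward trajectory once the momentum points back toward the starting point, i.e. $(x^{(n)} - x^{(0)}) \cdot p^{(n)} < 0$, together with a unit mass matrix. The helper names and the `max_steps` cap are hypothetical, and how ATLAS turns this count into a proposal is not shown.

```python
import numpy as np


def leapfrog(x, p, grad_U, eps):
    """One leapfrog step with unit mass matrix."""
    p = p - 0.5 * eps * grad_U(x)
    x = x + eps * p
    p = p - 0.5 * eps * grad_U(x)
    return x, p


def steps_to_u_turn(x0, p0, grad_U, eps, max_steps=1000):
    """Count leapfrog steps until the trajectory first turns back on itself."""
    x0 = np.asarray(x0, dtype=float)
    x, p = x0.copy(), np.asarray(p0, dtype=float).copy()
    for n in range(1, max_steps + 1):
        x, p = leapfrog(x, p, grad_U, eps)
        # U-turn: displacement from the start is anti-aligned with momentum.
        if np.dot(x - x0, p) < 0.0:
            return n
    return max_steps


# Usage on a 1D standard normal, where U(x) = x^2 / 2 and grad U(x) = x.
n_turn = steps_to_u_turn([0.0], [1.0], lambda x: x, eps=0.1)
```

In this example the exact trajectory is a quarter rotation in phase space before the U-turn, so the step count scales like $\pi / (2\epsilon)$; a smaller step size needs proportionally more steps to reach the same turning point.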

Theorems & Definitions (1)

  • Lemma 1