Improving sampling efficacy on high dimensional distributions with thin high density regions using Conservative Hamiltonian Monte Carlo

Geoffrey McGregor, Andy T. S. Wan

TL;DR

The paper tackles the loss of sampling efficiency that Hamiltonian Monte Carlo (HMC) suffers in high dimensions because of energy errors, especially when the density concentrates in thin high-density regions. It proposes Conservative Hamiltonian Monte Carlo (CHMC), which uses $R$-reversible energy-preserving integrators to generate proposals that stay on the same energy surface up to machine precision, and uses an approximate Jacobian determinant in the Metropolis step to achieve approximate stationarity with a controllable error $\|\epsilon\|_\infty = \mathcal{O}(\tau^p)$. In experiments on $p$-generalized $\chi$ and $p$-generalized Gaussian distributions, CHMC shows improved convergence and robustness to the integration parameters in high dimensions, outperforming HMC with Leapfrog in KS and Wasserstein distances. The work also describes a gradient-free variant and outlines future directions, including adaptive step-size schemes, alternative energy-preserving integrators, and a formal convergence theory. Code and supplementary materials are available for replication.

Abstract

Hamiltonian Monte Carlo is a prominent Markov Chain Monte Carlo algorithm, which employs symplectic integrators to sample from high dimensional target distributions in many applications, such as statistical mechanics, Bayesian statistics and generative models. However, such distributions tend to have thin high density regions, posing a significant challenge for symplectic integrators to maintain the small energy errors needed for a high acceptance probability. Instead, we propose a variant called Conservative Hamiltonian Monte Carlo, using $R$-reversible energy-preserving integrators to retain a high acceptance probability. We show our algorithm can achieve approximate stationarity with an error determined by the Jacobian approximation of the energy-preserving proposal map. Numerical evidence shows improved convergence and robustness over integration parameters on target distributions with thin high density regions and in high dimensions. Moreover, a version of our algorithm can also be applied to target distributions without gradient information.
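The contrast between symplectic and energy-preserving integration described in the abstract can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the target potential $U(\bm q)=\sum_i q_i^4/4$ (a 4-generalized Gaussian up to normalization), the unit mass matrix, the function names, and in particular the energy-preserving scheme, which is a simple discrete gradient method solved by fixed-point iteration and not necessarily the integrator used in CHMC.

```python
import numpy as np

def potential(q):
    # Assumed illustrative potential U(q) = sum_i q_i^4 / 4.
    return np.sum(q ** 4) / 4.0

def grad_potential(q):
    return q ** 3

def hamiltonian(q, p):
    # H(q, p) = U(q) + |p|^2 / 2 with a unit mass matrix (assumption).
    return potential(q) + 0.5 * np.dot(p, p)

def leapfrog_step(q, p, tau):
    # Standard leapfrog (Stoermer-Verlet) step of size tau.
    p_half = p - 0.5 * tau * grad_potential(q)
    q_new = q + tau * p_half
    p_new = p_half - 0.5 * tau * grad_potential(q_new)
    return q_new, p_new

def discrete_gradient(q, q_new):
    # Divided difference (U_i(q_new) - U_i(q)) / (q_new - q) for U_i(x) = x^4 / 4,
    # expanded algebraically so it stays well defined when q_new == q.
    return (q_new ** 3 + q_new ** 2 * q + q_new * q ** 2 + q ** 3) / 4.0

def energy_preserving_step(q, p, tau, n_iter=100):
    # Implicit discrete gradient step:
    #   q_new = q + tau * (p + p_new) / 2
    #   p_new = p - tau * discrete_gradient(q, q_new)
    # solved by plain fixed-point iteration (adequate for small tau).
    q_new, p_new = q.copy(), p.copy()
    for _ in range(n_iter):
        q_new = q + 0.5 * tau * (p + p_new)
        p_new = p - tau * discrete_gradient(q, q_new)
    return q_new, p_new

def max_energy_error(step, d=50, tau=0.1, n_steps=20, seed=0):
    # Largest |H(z_k) - H(z_0)| observed along one trajectory.
    rng = np.random.default_rng(seed)
    q, p = rng.standard_normal(d), rng.standard_normal(d)
    h0 = hamiltonian(q, p)
    worst = 0.0
    for _ in range(n_steps):
        q, p = step(q, p, tau)
        worst = max(worst, abs(hamiltonian(q, p) - h0))
    return worst

if __name__ == "__main__":
    print("leapfrog max |dH|:         ", max_energy_error(leapfrog_step))
    print("energy-preserving max |dH|:", max_energy_error(energy_preserving_step))
```

In this sketch the leapfrog error depends on the step size $\tau$, whereas the discrete gradient step conserves $H$ up to the tolerance of the fixed-point solve. Unlike leapfrog, such a map is generally not volume-preserving, which is why a Jacobian determinant enters the Metropolis step.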

Paper Structure

This paper contains 4 sections, 2 theorems, 2 equations, 4 figures, 1 algorithm.

Key Result

Theorem 1

Denote $\bm z :=(\bm{q},\bm{p}) \in\mathbb{R}^{2d}$ and let $\Psi:\mathbb{R}^{2d}\rightarrow \mathbb{R}^{2d}$ be a positively-oriented (i.e. $\det J_\Psi > 0$) $C^1$-diffeomorphism, with its Jacobian matrix entries $[J_\Psi]_{ij} \in L^{\infty}(\mathbb{R}^{2d})$. Also, suppose $\Psi$ is $R$--revers with the transition kernel density from $\bm z$ to $\bm z'$ be given by $\rho({\bm z},{\bm z}')=\al

Figures (4)

  • Figure 1: Comparison of histograms and convergence for HMC-LF versus CHMC when sampling the $6$-generalized $\chi$ distribution with increasing degrees of freedom $d$.
  • Figure 2: Comparison of errors for HMC-LF versus CHMC when sampling the $p$-generalized $\chi$ distribution for various $d$, $p$, and integration parameters $T$, $\tau$.
  • Figure 3: Comparison of convergence for HMC-LF versus CHMC in various metrics when sampling an i.i.d. 4-generalized Gaussian in high dimensions.
  • Figure 4: Histograms of $\exp(-\Delta H)$ versus $\det J_{\Psi_{EP}}$, violin plots of $\alpha$, and histograms of transformed samples (HMC-LF, CHMC-FullJ, CHMC).

Theorems & Definitions (2)

  • Theorem 1: Error bound on stationarity of $R$-reversible proposal with approximate Jacobian
  • Corollary 1: Approximate Stationarity of CHMC
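
For orientation, the quantities compared in Figure 4 fit the textbook Metropolis-Hastings acceptance rule for a deterministic, reversible proposal map $\Psi$. The expression below is that standard rule, stated as an assumption about how the pieces relate and not as the paper's exact formula; the symbol $\widetilde{D}$ for an approximate Jacobian determinant is hypothetical notation introduced here.

```latex
\alpha(\bm z) = \min\bigl\{1,\; \exp(-\Delta H)\,\det J_{\Psi}(\bm z)\bigr\},
\qquad \Delta H = H(\Psi(\bm z)) - H(\bm z).

% Volume-preserving proposal (e.g. leapfrog): \det J_{\Psi} = 1, so
\alpha(\bm z) = \min\bigl\{1,\; \exp(-\Delta H)\bigr\}.

% Energy-preserving proposal: \Delta H = 0, so
\alpha(\bm z) = \min\bigl\{1,\; \det J_{\Psi_{EP}}(\bm z)\bigr\}
\approx \min\bigl\{1,\; \widetilde{D}(\bm z)\bigr\}.
```

Under this reading, replacing $\det J_{\Psi_{EP}}$ with an approximation is what turns exact stationarity into the approximate stationarity addressed by Theorem 1 and Corollary 1.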