Adaptive Importance Tempering: A flexible approach to improve computational efficiency of Metropolis Coupled Markov Chain Monte Carlo algorithms on binary spaces

Alexander Valencia-Sanchez, Jeffrey S. Rosenthal, Yasuhiro Watanabe, Hirotaka Tamura, Ali Sheikholeslami

TL;DR

This work presents two equivalent versions of an adaptive Informed Importance Tempering algorithm (A-IIT and SS-IIT) and establishes that both have the same limiting distribution, making either suitable for use within a parallel tempering framework.

Abstract

Building on the Informed Importance Tempering (IIT) algorithm of Li et al. (2023), we propose an algorithm that uses an adaptive bounded balancing function. We argue that parallel tempering in which each replica runs a rejection-free MCMC algorithm can be inefficient in high-dimensional spaces, and we show how the proposed adaptive algorithm overcomes these computational inefficiencies. We present two equivalent versions of the adaptive algorithm (A-IIT and SS-IIT) and establish that both have the same limiting distribution, making either suitable for use within a parallel tempering framework. To evaluate performance, we benchmark the adaptive algorithm against several MCMC methods: IIT, rejection-free Metropolis-Hastings (RF-MH), and RF-MH using a multiplicity list. Simulation results demonstrate that Adaptive IIT identifies high-probability states more efficiently than these competing algorithms in high-dimensional binary spaces with multiple modes.
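As a rough illustration of the rejection-free idea that the abstract builds on, the sketch below runs plain IIT on a binary space with the balancing function $h(r)=\sqrt{r}$. It is not the adaptive A-IIT or SS-IIT variant, whose details are not given here. The toy bimodal target `log_pi`, the single-bit-flip neighborhood, and the names `iit_step` and `run_iit` are all assumptions made for illustration.

```python
import math
import random


def log_pi(x):
    # Hypothetical unnormalized log-target on {0,1}^d with two modes:
    # the all-zeros and the all-ones configurations.
    s = sum(x)
    return 2.0 * max(s, len(x) - s)


def h_sqrt(r):
    # Balancing function h(r) = sqrt(r); it satisfies h(r) = r * h(1/r).
    return math.sqrt(r)


def iit_step(x, log_target, h, rng):
    """One rejection-free step from x.

    Returns the next state and the importance weight attached to x.
    Neighbors are assumed to be all single-bit flips of x.
    """
    base = log_target(x)
    weights = []
    for i in range(len(x)):
        y = list(x)
        y[i] = 1 - y[i]
        weights.append(h(math.exp(log_target(y) - base)))
    z = sum(weights)
    # Always move: pick a neighbor with probability proportional to its weight.
    i = rng.choices(range(len(x)), weights=weights, k=1)[0]
    y = list(x)
    y[i] = 1 - y[i]
    # The weight 1/z corrects for the chain's stationary law, which is
    # proportional to pi(x) * z(x) rather than pi(x).
    return tuple(y), 1.0 / z


def run_iit(x0, n_steps, rng):
    # Collect (state, importance weight) pairs along the chain.
    samples = []
    x = tuple(x0)
    for _ in range(n_steps):
        y, w = iit_step(x, log_pi, h_sqrt, rng)
        samples.append((x, w))
        x = y
    return samples
```

Each step evaluates every neighbor of the current state, so the per-iteration cost grows with the dimension. Expectations under the target would be estimated as weighted averages over the collected `(state, weight)` pairs.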

Paper Structure (25 sections, 3 theorems, 18 equations, 11 figures, 18 tables, 7 algorithms)

Key Result

Proposition 1

Let $\pi$ be the target probability distribution, $\gamma\geq1$ be fixed and consider the Markov chain kernel $P_{\gamma}$ defined in equation eq:markov_kernel_adaptive where $\mathcal{Q}$ is a proposal distribution that makes the kernel irreducible and aperiodic and $h_{\gamma}$ is a balancing func

Figures (11)

  • Figure 1: Comparison of balancing functions, showing the possible values that the expression $h\left(\frac{\pi(y)}{\pi(x)}\right)$ may take for two neighbors of $x$: $y_1$ with lower probability and $y_2$ with higher probability.
  • Figure 2: Number of rejection-free iterations for replicas tempered at different temperatures.
  • Figure 3: Bounded balancing function based on $h(r)=\sqrt{r}$ for different values of the bounding constant $\gamma$.
  • Figure 4: Performance of the four algorithms on the low-dimensional bimodal problems. Left: time to visit all the modes of the target distribution. Right: evolution of the total variation distance.
  • Figure 5: Performance of the four algorithms on the low-dimensional multimodal problem. Left: time to visit all the modes of the target distribution. Right: evolution of the total variation distance.
  • ...and 6 more figures

Theorems & Definitions (6)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof