Table of Contents
Fetching ...

Metropolis Monte Carlo sampling: convergence, localization transition and optimality

Alexei D. Chepelianskii, Satya N. Majumdar, Hendrik Schawe, Emmanuel Trizac

TL;DR

This paper investigates convergence in Metropolis Monte Carlo sampling under a confining potential by deriving the Master equation with a jump-length scale $a$, and showing that the relaxation dynamics undergo a localization transition at a critical value $a^*$. The leading relaxation rate $\Lambda$ can be accurately estimated in the diffusion-dominated regime ($a<a^*$) by projecting the Master equation onto the lowest Fokker-Planck eigenmodes, yielding a Schrödinger-type operator and explicit relations $\lambda_n = 1 - \sigma^2 \epsilon_n$. At $a=a^*$ the leading relaxation mode merges with a singular continuum of the spectrum, and for $a>a^*$ the dynamics are governed by maximal rejection probabilities, causing the relaxation error to localize at specific points. The study extends beyond 1D to higher dimensions and interacting systems, demonstrating the robustness of the localization phenomenon and offering analytic tools to estimate optimal jump amplitudes and improve Monte Carlo efficiency.

Abstract

Among random sampling methods, Markov Chain Monte Carlo algorithms are foremost. Using a combination of analytical and numerical approaches, we study their convergence properties towards the steady state, within a random walk Metropolis scheme. Analysing the relaxation properties of some model algorithms sufficiently simple to enable analytic progress, we show that the deviations from the target steady-state distribution can feature a localization transition as a function of the characteristic length of the attempted jumps defining the random walk. While the iteration of the Monte Carlo algorithm converges to equilibrium for all choices of jump parameters, the localization transition changes drastically the asymptotic shape of the difference between the probability distribution reached after a finite number of steps of the algorithm and the target equilibrium distribution. We argue that the relaxation before and after the localisation transition is respectively limited by diffusion and rejection rates.

Metropolis Monte Carlo sampling: convergence, localization transition and optimality

TL;DR

This paper investigates convergence in Metropolis Monte Carlo sampling under a confining potential by deriving the Master equation with a jump-length scale , and showing that the relaxation dynamics undergo a localization transition at a critical value . The leading relaxation rate can be accurately estimated in the diffusion-dominated regime () by projecting the Master equation onto the lowest Fokker-Planck eigenmodes, yielding a Schrödinger-type operator and explicit relations . At the leading relaxation mode merges with a singular continuum of the spectrum, and for the dynamics are governed by maximal rejection probabilities, causing the relaxation error to localize at specific points. The study extends beyond 1D to higher dimensions and interacting systems, demonstrating the robustness of the localization phenomenon and offering analytic tools to estimate optimal jump amplitudes and improve Monte Carlo efficiency.

Abstract

Among random sampling methods, Markov Chain Monte Carlo algorithms are foremost. Using a combination of analytical and numerical approaches, we study their convergence properties towards the steady state, within a random walk Metropolis scheme. Analysing the relaxation properties of some model algorithms sufficiently simple to enable analytic progress, we show that the deviations from the target steady-state distribution can feature a localization transition as a function of the characteristic length of the attempted jumps defining the random walk. While the iteration of the Monte Carlo algorithm converges to equilibrium for all choices of jump parameters, the localization transition changes drastically the asymptotic shape of the difference between the probability distribution reached after a finite number of steps of the algorithm and the target equilibrium distribution. We argue that the relaxation before and after the localisation transition is respectively limited by diffusion and rejection rates.
Paper Structure (30 sections, 87 equations, 17 figures, 1 table)

This paper contains 30 sections, 87 equations, 17 figures, 1 table.

Figures (17)

  • Figure 1: The top panel shows the spectrum of $F_\beta$ for harmonic confinement $U(x) = x^2/2$ as a function of jump amplitude $a$, for a uniform jump distribution of range $(-a,a)$. The color code, provided on the right-hand-side, is for the Inverse Participation Ratio of the eigenvector associated to the eigenvalue displayed (see Appendix \ref{['sec:overview']}). The upper envelope of the relaxation spectrum defines $\Lambda$, see Eq. \ref{['eq:Lambda_def']}; shown by the red line, it reaches its minimum for $a=a_{\text{opt}} \simeq 3.33$. This value coincides with the threshold $a^*$ for localization. Here, $\cal N$, denoting the number of discrete relaxation modes (excluding the stationary state), is $0$ for large jumps ($a>a^*$), while ${\cal N}$ quickly grows as $a$ diminishes. Dashed lines show the bounds for the singular continuum, that appears in dark blue, see Eqs. \ref{['eq:rejection_rate']} and \ref{['eq:convergence_rate']}. The bottom panel is for the IPR associated to $\Lambda$, as a function of $a$ (same abscissa as the upper panel). The localization transition is signaled by the sharp jump at $a=a^*$. This threshold does not depend on the number of sites $N_d$, as long as $N_d$ is large enough. Here $N_d=1000$. The length unit is the thermal length, meaning the standard deviation of $P_\infty(x)$.
  • Figure 2: Scaled evolution of the error $\delta P_n(x) = P_n(x) - P_\infty(x)$ vs $x$ for different times $n$ (indicated by the color code on the right), for the same system as in Fig. \ref{['fig:SpectrumBoxExp']}, with the same choice of length unit. Initial $P_0(x) = (2 \pi)^{-1/2} \exp(-(x-1)^2/2)$. Comparison between $a = 3.6>a^*$ (main graph) and $a = 3<a^*$ (inset). Although the values of $a$ and the convergence rates are similar in the two graphs, the asymptotic errors are significantly different.
  • Figure 3: Comparison between the exact calculation presented in Appendix \ref{['sec:box']} and the numerical data for the scaling behavior of localization. Box potential confinement with $w(\eta) = 3(1+a^{-2}\eta^2)\theta(a-|\eta|)/(8 a)$. Lengths are expressed in unit of the box size $L$, convergence to the scaling function is shown for $a = 2.1$ and $P_0(x) = 2 \theta(1/2-|x|)$. Our analytical expression for the scaling function $\varphi(z)$ and the proof that ${\cal N} = 0$, for this choice of $w(\eta)$, are obtained for $a > 2 > a^* \simeq 1.79$.
  • Figure 4: Harmonic confinement with flat jump distribution. Plot of the largest non-stationary eigenvalue as function of jump amplitude $a$, obtained by 1) exact numerical diagonalization (dots) and 2) analytical approximation in the truncated Schrödinger equation basis, with increasing $N_s$, the number of anti-symmetric diffusion modes retained (see Appendix \ref{['sec:analytical']}). With $N_s=2$, the analytical predictions are already very accurate for $a < a^*$, convergence is very slow on the other side of the localization transition $a > a^*$
  • Figure 5: Convergence rate $\Lambda$ vs. jump amplitude $a$ for a harmonic confinement ($U(x) = x^2/2$) and a flat $w(\eta)$ distribution as in Eq. \ref{['eq:flatw']} (cf. Fig. 1 of the main text). Comparison between the direct Monte Carlo simulations measure (MC) and the numerical diagonalization technique. Both methods agree very well. The Monte Carlo approach needs to look at symmetric and asymmetric observables to measure the different branches of the spectrum. The observables used are provided in Eq. \ref{['eq:observables']}. For $a<a^*\simeq 3.33$, using the asymmetric observable ${\cal O}_1$ provides a very good estimate of the largest eigenvalue $\lambda_1=\Lambda$. For $a<a^*$, using the even observable ${\cal O}_2$ yields the second largest eigenvalue $\lambda_2$. For $a>a^*$, the largest of both ($\lambda$ from ${\cal O}_2)$ gives a very good estimation of $\Lambda$. The singular continuum is shown by the grey region. As in the main text, $a$ is given in units of the thermal length at equilibrium.
  • ...and 12 more figures