Accelerated Markov Chain Monte Carlo Using Adaptive Weighting Scheme

Yanbo Wang; Wenyu Chen; Shimin Shan

TL;DR

This work addresses the efficiency of Gibbs sampling by rethinking the scan order. It introduces a weighted, non-uniform random-scan Gibbs sampler that preserves the target posterior while speeding up mixing by prioritizing updates of high-variance coordinates. The authors derive an analytic weight rule, $q_i=\frac{\sqrt{d_i}}{\sum_j \sqrt{d_j}}$ with $d_i=2\mathrm{Var}_{\pi_i}(x_i)$, and provide an empirical method to estimate $\hat{d}_i$ with regularization, enabling practical deployment. Across Gaussian, grid-lattice MRF, and LDA experiments, the method accelerates convergence, especially when variances are heterogeneous, highlighting its potential for scalable inference in large models.
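
As an illustration of the weight rule above, the following minimal sketch turns a vector of per-coordinate variances into selection probabilities $q_i$. The function name `selection_weights` and the `eps` term are assumptions made for this example; in particular, `eps` is only a stand-in for the regularization mentioned above, whose exact form is not given in this summary.

```python
import numpy as np

def selection_weights(variances, eps=1e-8):
    """Return q_i = sqrt(d_i) / sum_j sqrt(d_j) with d_i = 2 * Var(x_i).

    `eps` is a hypothetical regularizer that keeps every weight strictly
    positive; it is not taken from the paper.
    """
    d = 2.0 * np.asarray(variances, dtype=float)
    root = np.sqrt(d + eps)
    return root / root.sum()

# Variances 1, 4, 9 have square roots in ratio 1 : 2 : 3,
# so the weights are close to 1/6, 1/3, 1/2.
q = selection_weights([1.0, 4.0, 9.0])
print(q)
```

Because the common factor $\sqrt{2}$ cancels in the normalization, the weights depend only on the relative standard deviations of the coordinates.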

Abstract

Gibbs sampling is one of the most commonly used Markov Chain Monte Carlo (MCMC) algorithms due to its simplicity and efficiency. It cycles through the latent variables, sampling each one from its distribution conditional on the current values of all the other variables. Conventional Gibbs sampling is based on the systematic scan (with a deterministic order of variables). In contrast, in recent years, Gibbs sampling with random scan has shown its advantage in some scenarios. However, almost all analyses of random-scan Gibbs sampling assume uniform selection of variables. In this paper, we focus on a random-scan Gibbs sampling method that selects each latent variable non-uniformly. First, we show that this non-uniform scan Gibbs sampling leaves the target posterior distribution invariant. Then we explore how to determine the selection probability for latent variables. In particular, we construct an objective as a function of the selection probability and solve the constrained optimization problem. We further derive an analytic solution of the selection probability, which can be estimated easily. Our algorithm relies on the simple intuition that choosing the variable updates according to their marginal probabilities improves the mixing time of the Markov chain. Finally, we validate the effectiveness of the proposed Gibbs sampler by conducting a set of experiments on real-world applications.
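
The non-uniform random scan described above can be sketched on a toy target. The example below is an assumption-laden illustration, not the authors' implementation: it uses a zero-mean Gaussian with a known precision matrix, reads $\mathrm{Var}_{\pi_i}(x_i)$ as the conditional variance $1/\Lambda_{ii}$, and computes the weights exactly instead of estimating $\hat{d}_i$ from samples. The name `weighted_gibbs_gaussian` and the example precision matrix are hypothetical.

```python
import numpy as np

def weighted_gibbs_gaussian(precision, n_steps, rng):
    """Random-scan Gibbs on N(0, precision^{-1}) with non-uniform selection."""
    prec = np.asarray(precision, dtype=float)
    dim = prec.shape[0]
    # For a Gaussian, Var(x_i | x_-i) = 1 / prec[i, i].
    cond_var = 1.0 / np.diag(prec)
    root = np.sqrt(2.0 * cond_var)      # sqrt(d_i) with d_i = 2 * Var
    q = root / root.sum()               # selection probabilities
    x = np.zeros(dim)
    samples = np.empty((n_steps, dim))
    for t in range(n_steps):
        i = rng.choice(dim, p=q)        # pick one coordinate non-uniformly
        # Conditional mean: -(1 / prec_ii) * sum_{j != i} prec_ij * x_j
        off_diag = prec[i] @ x - prec[i, i] * x[i]
        x[i] = rng.normal(-off_diag * cond_var[i], np.sqrt(cond_var[i]))
        samples[t] = x
    return samples, q

rng = np.random.default_rng(0)
# Heterogeneous, weakly coupled example; diagonal plus a small constant
# keeps the matrix positive definite.
precision = np.diag([1.0, 0.25, 0.04]) + 0.01
samples, q = weighted_gibbs_gaussian(precision, n_steps=20000, rng=rng)
print(q)
```

In this toy setting the coordinate with the largest conditional variance is updated most often, which matches the intuition of spending more updates on high-variance coordinates.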

Paper Structure (19 sections, 2 theorems, 18 equations, 9 figures, 3 algorithms)

Introduction
Related Work
Gibbs Sampling with Accelerated Convergence
Non-uniform Sampling in Optimization
Weighted Gibbs Sampler
Non-uniform Scan Gibbs Sampling
Proof Sketch
Maximize the Effective Sample Size
Analytic Form of the Sampling Weights
Proof Sketch
Intuition
Estimate $\hat{d}_i$
Experiments
Gaussian Distribution with Synthetic Data
Grid-lattice MRF for Image Denoising
...and 4 more sections

Key Result

Theorem 1

The stationary distribution of non-uniform scans is the expected target distribution.
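
For a fixed set of selection probabilities, the invariance property can be checked numerically on a small discrete target. The sketch below is a sanity check under stated assumptions, not the paper's proof: the 2x3 joint table, the weights `q`, and the helper `gibbs_kernel` are all hypothetical. It builds the single-coordinate Gibbs kernels, mixes them with non-uniform weights, and verifies $\pi P = \pi$.

```python
import itertools
import numpy as np

def gibbs_kernel(joint, axis):
    """Transition matrix that resamples coordinate `axis` of a 2-D joint table."""
    states = list(itertools.product(range(joint.shape[0]), range(joint.shape[1])))
    other = 1 - axis
    kernel = np.zeros((len(states), len(states)))
    for a, s in enumerate(states):
        # Unnormalized conditional of coordinate `axis` given s[other].
        index = [slice(None), slice(None)]
        index[other] = s[other]
        cond = joint[tuple(index)]
        for b, t in enumerate(states):
            if t[other] == s[other]:    # the other coordinate is not changed
                kernel[a, b] = cond[t[axis]] / cond.sum()
    return kernel, states

# Hypothetical target distribution over two discrete variables.
joint = np.array([[0.10, 0.20, 0.05],
                  [0.30, 0.15, 0.20]])
k0, states = gibbs_kernel(joint, axis=0)
k1, _ = gibbs_kernel(joint, axis=1)

q = np.array([0.8, 0.2])                # non-uniform selection probabilities
mixture = q[0] * k0 + q[1] * k1         # one step of the weighted random scan
pi = np.array([joint[s] for s in states])

print(np.allclose(pi @ mixture, pi))
```

Each single-coordinate kernel leaves the target invariant on its own, so any convex combination of them does as well; the check holds for any valid choice of `q`.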

Figures (9)

Figure 1: Synthetic data experiment on a Gaussian distribution. For the homogeneous case, the gaps between all three methods are small. The weighted Gibbs sampler outperforms baseline methods clearly in the heterogeneous case.
Figure 2: Visualization of sampling steps using different sampling methods from the synthetic data experiment. Since the experiment is high-dimensional, the samples are projected onto the first two principal components (PCA) for this visualization. It shows that the burn-in period is shorter using the proposed weighted Gibbs sampling.
Figure 3: An exemplar case of image denoising using Grid-Lattice Markov Random Field. In this example, the noise variance is set to $\sigma = 1.0$.
Figure 4: Evaluation of the image denoising task using Grid-Lattice Markov Random Field. Panels (a)-(d) show relative $L_2$ reconstruction errors as a function of the number of draws for different variances. Panel (e) displays the weight distribution for different $\sigma$, and panel (f) reports the weights of different pixels. We also apply the same procedure to the other three images in the dataset and report the results in the supplementary materials.
Figure 5: Synthetic data examples. Panel (b) shows a subset of the generated images (documents), where each image is the result of 100 samples from a unique mixture of the topics shown in panel (a).
...and 4 more figures

Theorems & Definitions (2)

Theorem 1
Lemma 1
