Accelerating Multilevel Markov Chain Monte Carlo Using Machine Learning Models

Sohail Reddy; Hillary Fairbanks

TL;DR

The paper addresses the computational bottleneck of Bayesian inverse problems with expensive forward maps by introducing a surrogate-augmented multilevel MCMC (MLMCMC) method that combines a geometric multigrid hierarchy with a low-fidelity machine learning model (MLM) on the coarsest level. A two-stage Metropolis-Hastings (MH) step on the coarsest level uses the MLM to evaluate proposals cheaply, then applies a PDE-based filter to control the approximation error; the paper proves detailed balance and consistency. On a four-level Darcy flow test, the approach achieves roughly a 2x speedup over the PDE-only hierarchy while preserving posterior accuracy, using a CNN surrogate trained on coarse-level data and coupled across levels via MLDA. The work also gives practical guidance on surrogate accuracy requirements and highlights robustness across levels, suggesting applicability to large-scale Bayesian inference in subsurface flow and other PDE-driven systems.

Abstract

This work presents an efficient approach for accelerating multilevel Markov Chain Monte Carlo (MCMC) sampling for large-scale problems using low-fidelity machine learning models. Conventional techniques for large-scale Bayesian inference often substitute computationally expensive high-fidelity models with machine learning models, thereby introducing approximation errors. Our approach offers a computationally efficient alternative by augmenting high-fidelity models with low-fidelity ones within a hierarchical framework. The multilevel approach uses the low-fidelity machine learning model (MLM) for inexpensive evaluation of proposed samples, thereby improving the acceptance of samples by the high-fidelity model. The hierarchy in our multilevel algorithm is derived from a geometric multigrid hierarchy. We use an MLM to accelerate the coarse-level sampling. Training the machine learning model for the coarsest level significantly reduces the computational cost of generating training data and training the model. We present an MCMC algorithm that accelerates the coarsest-level sampling using the MLM and accounts for the approximation error introduced. We provide theoretical proofs of detailed balance and demonstrate that our multilevel approach constitutes a consistent MCMC algorithm. Additionally, we derive conditions on the accuracy of the machine learning model that facilitate more efficient hierarchical sampling. Our technique is demonstrated on a standard benchmark inference problem in groundwater flow, where we estimate the probability density of a quantity of interest using a four-level MCMC algorithm. Our proposed algorithm accelerates multilevel sampling by a factor of two while achieving accuracy similar to that of the standard multilevel algorithm.
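
To illustrate the idea of screening proposals with a cheap model before spending a high-fidelity evaluation, here is a minimal sketch of a generic two-stage (delayed-acceptance) Metropolis-Hastings sampler. It is not the paper's algorithm: the function names, the one-dimensional Gaussian toy target, the deliberately biased surrogate, and the symmetric random-walk proposal are all assumptions made for illustration. In the paper's setting, `log_post_surr` would be replaced by the MLM and `log_post_fine` by the coarse-level PDE solve.

```python
import numpy as np

def two_stage_mh(log_post_fine, log_post_surr, x0, n_steps, step=1.0, seed=0):
    """Generic delayed-acceptance MH with a symmetric random-walk proposal.

    Stage 1 accepts or rejects using only the cheap surrogate.
    Stage 2 corrects with the expensive model so the chain targets
    exp(log_post_fine) rather than the surrogate density.
    """
    rng = np.random.default_rng(seed)
    x = float(x0)
    lf_x, ls_x = log_post_fine(x), log_post_surr(x)
    chain = np.empty(n_steps)
    n_stage1 = 0  # proposals that passed the surrogate screen
    n_stage2 = 0  # proposals finally accepted
    for i in range(n_steps):
        y = x + step * rng.standard_normal()
        ls_y = log_post_surr(y)
        # Stage 1: standard MH test against the surrogate density.
        if np.log(rng.uniform()) < ls_y - ls_x:
            n_stage1 += 1
            lf_y = log_post_fine(y)  # expensive call, only for promoted proposals
            # Stage 2: ratio of fine densities divided by ratio of surrogate densities.
            if np.log(rng.uniform()) < (lf_y - lf_x) - (ls_y - ls_x):
                x, lf_x, ls_x = y, lf_y, ls_y
                n_stage2 += 1
        chain[i] = x
    return chain, n_stage1, n_stage2

# Hypothetical toy problem: the "fine" target is a standard normal and the
# surrogate is a shifted, slightly wider Gaussian standing in for a trained model.
def log_post_fine(x):
    return -0.5 * x**2

def log_post_surr(x):
    return -0.5 * ((x - 0.2) / 1.2) ** 2

chain, n_stage1, n_stage2 = two_stage_mh(log_post_fine, log_post_surr, 0.0, 20000)
print("fine-model evaluations:", n_stage1, "of", len(chain), "proposals")
print("sample mean, variance:", chain.mean(), chain.var())
```

The fine model is evaluated only for proposals that survive the first stage, which is where the savings come from. The second-stage ratio removes the surrogate's bias, so the sample moments should match the fine target up to Monte Carlo error even though the surrogate is off.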

Paper Structure (12 sections, 2 theorems, 35 equations, 4 figures, 1 table, 2 algorithms)

Introduction
Bayesian Inverse Problem: Markov Chain Monte Carlo
Hierarchical Sampling of Gaussian Random Fields
Multi-Level Markov Chain Monte Carlo
Subsurface Flow: Darcy's Equations
Results
Problem Formulation
Bayesian Inference in Subsurface Flow
Conclusion
Appendix
Integrated Autocorrelation Time
Number of Effective Samples

Key Result

Proposition 1

The algorithm labeled Alg:FilterMCMC simulates a Markov chain that is in detailed balance with $\pi(\cdot)$.
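
For reference, the detailed balance property in this proposition can be written out as follows. The transition kernel symbol $K$ is notation assumed here and may differ from the paper's. Integrating the condition over $x$ gives the second identity, which says that $\pi$ is an invariant distribution of the chain.

```latex
\pi(\mathrm{d}x)\, K(x, \mathrm{d}y) = \pi(\mathrm{d}y)\, K(y, \mathrm{d}x)
\quad\Longrightarrow\quad
\int \pi(\mathrm{d}x)\, K(x, \mathrm{d}y) = \pi(\mathrm{d}y).
```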

Figures (4)

Figure 1: The convergence in mean-squared error (MSE) for training and validation set for the training epochs.
Figure 2: Analysis of chains on each level of a four-level hierarchy showing: (a) the mean in $Y_\ell$, (b) the variance in $Y_\ell$, (c) the effective sample size, and (d) the acceptance ratio. (e) The speed-up relative to the reference for hierarchies with different numbers of levels. The green $\times$ represents the acceptance rate of the complete two-stage algorithm, whereas the green $\bullet$ represents the acceptance rate of only the second stage.
Figure 3: The Wasserstein distance (a,d) between the reference posterior distribution and those obtained using ML-augmented MCMC on each level, the posterior distributions on the coarsest level (b,e) and the finest level (c,f) for $Q$ (top) and $Y$ (bottom).
Figure 4: Autocorrelation estimates of the quantity of interest $Q$ (top) and the corresponding $Y$ (bottom) on different levels for increasing lag times. In the panel for $Q$ on level 0 (label fig:AutoC:Q:L0), the solid line represents the estimate for only the second stage, while the dashed line represents the estimate for the complete two-stage algorithm.
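
The autocorrelation and effective sample size diagnostics shown in Figures 2 and 4 can be estimated from a chain roughly as sketched below. This assumes the common definitions $\tau = 1 + 2\sum_{k \ge 1} \rho_k$ and $N_{\mathrm{eff}} = N / \tau$. The truncation rule (stop at the first non-positive autocorrelation), the `max_lag` default, and the AR(1) test series are assumptions for illustration; the estimator in the paper's appendix is not reproduced here.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Normalized sample autocorrelation rho_0..rho_max_lag of a 1-D chain."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    var = np.dot(x, x) / n
    return np.array([np.dot(x[: n - k], x[k:]) / (n * var)
                     for k in range(max_lag + 1)])

def integrated_autocorr_time(x, max_lag=200):
    """tau = 1 + 2 * sum(rho_k), truncated at the first non-positive rho_k."""
    rho = autocorrelation(x, max_lag)
    tau = 1.0
    for k in range(1, max_lag + 1):
        if rho[k] <= 0.0:  # assumed truncation rule to limit estimator noise
            break
        tau += 2.0 * rho[k]
    return tau

def effective_sample_size(x, max_lag=200):
    """Number of effectively independent samples, N / tau."""
    return len(x) / integrated_autocorr_time(x, max_lag)

# Hypothetical test series: an AR(1) process with coefficient phi, whose
# exact integrated autocorrelation time is (1 + phi) / (1 - phi).
rng = np.random.default_rng(0)
phi = 0.8
noise = rng.standard_normal(50_000)
ar1 = np.empty_like(noise)
ar1[0] = noise[0]
for t in range(1, len(noise)):
    ar1[t] = phi * ar1[t - 1] + noise[t]

tau = integrated_autocorr_time(ar1)
ess = effective_sample_size(ar1)
print("estimated tau:", tau, "exact tau:", (1 + phi) / (1 - phi))
print("effective sample size:", ess, "of", len(ar1))
```

A strongly correlated chain has a large $\tau$ and therefore few effective samples per iteration, which is why effective sample size, rather than raw chain length, is the fairer basis for comparing samplers.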

Theorems & Definitions (8)

Remark 1
Proposition 1
proof
Remark 2
Lemma 1
proof
Remark 3
Remark 4
