Compositional amortized inference for large-scale hierarchical Bayesian models

Jonas Arruda, Vikas Pandey, Catherine Sherry, Margarida Barroso, Xavier Intes, Jan Hasenauer, Stefan T. Radev

TL;DR

A new error-damping estimator addresses the stability issues that compositional score matching (CSM) previously showed when aggregating large numbers of data points. It performs competitively with direct amortized Bayesian inference (ABI) baselines on smaller problem sizes while using less than one full model simulation for larger problem sizes.

Abstract

Amortized Bayesian inference (ABI) with neural networks has emerged as a powerful simulation-based approach for estimating complex mechanistic models. However, extending ABI to hierarchical models, a cornerstone of modern Bayesian analysis, has been a major hurdle due to the need to simulate and process massive datasets. Our study tackles these challenges by extending compositional score matching (CSM), a divide-and-conquer strategy for Bayesian updating using diffusion models. We develop a new error-damping estimator to address previous stability issues of CSM when aggregating large numbers of data points. We first verified the numerical stability with up to 100,000 data points on a controlled benchmark. We then evaluated our method on a hierarchical AR model, achieving performance competitive with direct ABI baselines on smaller problem sizes while using less than one full model simulation for larger problem sizes. Finally, we address a large-scale inverse problem in advanced microscopy with over 750,000 parameters, demonstrating the method's relevance to real scientific applications.

Paper Structure

This paper contains 52 sections, 3 theorems, 42 equations, 12 figures, 3 tables.

Key Result

Proposition 3.1

The mini-batch estimator (Eq. eq:mini-batch) is an unbiased estimator of the compositional score.
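
To illustrate the proposition, here is a minimal numerical sketch, not the paper's implementation. It assumes the usual compositional factorization for conditionally independent observations, $p(\theta \mid y_{1:n}) \propto p(\theta)^{1-n} \prod_i p(\theta \mid y_i)$, and a mini-batch estimator that rescales the sum over a uniformly drawn subset of size $m$ by $n/m$. The paper's equation may differ in detail: for example, it acts on noised diffusion scores, whereas this sketch uses exact scores at zero noise. The conjugate Gaussian model and all function names are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

# Assumed toy model (not from the paper): theta ~ N(0, 1), y_i | theta ~ N(theta, 1).

def prior_score(theta, prior_var=1.0):
    # Gradient of log N(theta; 0, prior_var) with respect to theta.
    return -theta / prior_var

def single_posterior_score(theta, y, prior_var=1.0, noise_var=1.0):
    # Gradient of log p(theta | y_i) for a single observation (conjugate Gaussian).
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    post_mean = post_var * y / noise_var
    return -(theta - post_mean) / post_var

def compositional_score(theta, ys):
    # Full compositional score: (1 - n) * prior score + sum of individual posterior scores.
    n = len(ys)
    return (1 - n) * prior_score(theta) + sum(
        single_posterior_score(theta, y) for y in ys
    )

def minibatch_score(theta, ys, batch_idx):
    # Mini-batch estimate: the sum over the batch is rescaled by n / m.
    n, m = len(ys), len(batch_idx)
    batch_sum = sum(single_posterior_score(theta, ys[i]) for i in batch_idx)
    return (1 - n) * prior_score(theta) + (n / m) * batch_sum

rng = np.random.default_rng(0)
ys = rng.normal(loc=0.5, scale=1.0, size=6)
theta = 0.3

full = compositional_score(theta, ys)

# Exact expectation under uniform sampling without replacement:
# average the estimator over every subset of size m = 2.
subsets = list(itertools.combinations(range(len(ys)), 2))
mean_mb = np.mean([minibatch_score(theta, ys, idx) for idx in subsets])

print("full score:          ", full)
print("mean mini-batch score:", mean_mb)
```

Averaging over all subsets of a fixed size should reproduce the full compositional score up to floating-point error, because every index appears in the same fraction $m/n$ of subsets and the $n/m$ rescaling cancels it. Individual mini-batch estimates still vary around that value.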

Figures (12)

  • Figure 1: Compositional inference for hierarchical Bayesian models. Overview of our training procedure (left) and inference stages (right) for amortized hierarchical Bayesian modeling. Amortized posterior sampling uses our error-damping compositional score estimator to achieve rapid inference on very high-dimensional hierarchical problems.
  • Figure 2: Evaluation of the error-damping estimator for the Gaussian toy example. Evaluation metrics are shown for different dataset sizes and damping factors $d_1$ or cosine shifts $s$. The mini-batch size was set to 10% of the dataset size, and for each step, 10 runs were performed. The median and median absolute deviation are reported, except where none of the runs converged.
  • Figure 3: Assessing inference for high-resolution grids ($128 \times 128$). A: Global parameter recovery across 100 datasets, showing the posterior median and median absolute deviation. B: Posterior calibration plot for the global parameters using simulation-based calibration (SBC; Säilynoja et al., 2022).
  • Figure 4: Inference for fluorescence lifetime imaging. A: Mean intensity across time for each pixel, representing the fluorescence data. B: Time series data and fitted posterior median for representative pixels. C: Spatial map of the fitted local posteriors (medians) per pixel. D: Spatial map of $R^2$ for each pixel, comparing our results with a flat Bayesian model and a popular baseline (MLE).
  • Figure 5: Assessing the adaptive sampling scheme for compositional inference in the toy model. (a) More sampling steps are needed as the number of subsets of groups increases. (b) The step size is adaptively increased towards the end of sampling (low-noise region).
  • ...and 7 more figures

Theorems & Definitions (5)

  • Proposition 3.1
  • Proposition A.1
  • proof
  • Proposition A.2
  • proof