An efficient likelihood-free Bayesian inference method based on sequential neural posterior estimation

Yifei Xiong, Xiliang Yang, Sanguo Zhang, Zhijian He

TL;DR

This paper tackles likelihood-free Bayesian inference for simulator-based models by refining sequential neural posterior estimation (SNPE-B) with adaptive calibration kernels and variance-reduction techniques. It introduces defensive sampling, multiple importance sampling with sample recycling, and an ESS-driven adaptive calibration kernel to stabilize training and improve posterior accuracy, along with a parameter-space transformation to prevent mass leakage. The proposed All-SNPE-B approach, which combines these strategies, achieves faster training and superior posterior approximations across benchmarks and a high-dimensional real-world dataset compared with SNPE-A, SNL, APT, and SMC-ABC under budget constraints. The work demonstrates the practical impact of targeted variance control and kernel adaptation for scalable, high-quality inference in likelihood-free settings, with broad applicability to other posteriors and inference frameworks.
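The effect of defensive sampling on the importance weights can be illustrated with a minimal sketch. It assumes, purely for illustration, that the defensive component is the prior itself and that the proposal is a mixture with weight `alpha` on that component; the one-dimensional densities, the value of `alpha`, and all function names are hypothetical and not taken from the paper. Under these assumptions the weight prior/proposal can never exceed `1 / alpha`, whereas the weight against the narrow proposal alone can blow up in the tails.

```python
import numpy as np

def normal_pdf(theta, mean, std):
    """Density of a univariate normal distribution."""
    z = (theta - mean) / std
    return np.exp(-0.5 * z**2) / (std * np.sqrt(2.0 * np.pi))

# Hypothetical 1-D setup: a wide prior and a narrow learned proposal.
PRIOR_MEAN, PRIOR_STD = 0.0, 3.0
PROP_MEAN, PROP_STD = 1.0, 0.3
alpha = 0.2  # assumed mixing weight of the defensive (prior) component

def prior_pdf(theta):
    return normal_pdf(theta, PRIOR_MEAN, PRIOR_STD)

def proposal_pdf(theta):
    return normal_pdf(theta, PROP_MEAN, PROP_STD)

def defensive_pdf(theta):
    """Mixture density (1 - alpha) * proposal + alpha * prior."""
    return (1.0 - alpha) * proposal_pdf(theta) + alpha * prior_pdf(theta)

# Draw from the mixture: each sample comes from the prior with probability alpha.
rng = np.random.default_rng(0)
n = 5000
from_prior = rng.random(n) < alpha
theta = np.where(from_prior,
                 rng.normal(PRIOR_MEAN, PRIOR_STD, n),
                 rng.normal(PROP_MEAN, PROP_STD, n))

# Importance weights against the defensive mixture are bounded by 1 / alpha.
w_def = prior_pdf(theta) / defensive_pdf(theta)

# For comparison: the weight against the narrow proposal alone at a tail point.
tail_point = 4.0
naive_tail_weight = prior_pdf(tail_point) / proposal_pdf(tail_point)

print("largest defensive weight:", w_def.max(), "bound:", 1.0 / alpha)
print("naive weight at theta = 4:", naive_tail_weight)
```

The bounded weights are what keeps the variance of the weighted loss under control in this toy setting; how the paper chooses the defensive density and the mixing weight is not stated in this summary.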

Abstract

Sequential neural posterior estimation (SNPE) techniques have been recently proposed for dealing with simulation-based models with intractable likelihoods. Unlike approximate Bayesian computation, SNPE techniques learn the posterior from sequential simulation using neural network-based conditional density estimators by minimizing a specific loss function. The SNPE method proposed by Lueckmann et al. (2017) used a calibration kernel to boost the sample weights around the observed data, resulting in a concentrated loss function. However, the use of calibration kernels may increase the variances of both the empirical loss and its gradient, making the training inefficient. To improve the stability of SNPE, this paper proposes to use an adaptive calibration kernel and several variance reduction techniques. The proposed method greatly speeds up the process of training and provides a better approximation of the posterior than the original SNPE method and some existing competitors as confirmed by numerical experiments. We also managed to demonstrate the superiority of the proposed method for a high-dimensional model with a real-world dataset.
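The trade-off described above, where a narrow calibration kernel concentrates the loss around the observation but leaves few samples with meaningful weight, can be sketched numerically. The code below is an illustration under stated assumptions rather than the authors' implementation: it uses a Gaussian kernel with identity covariance, measures weight degeneracy with the usual effective sample size (ESS), and picks the rate `tau` by bisection so that the ESS equals a chosen fraction of the sample size. The function names, the target fraction, and the synthetic data are all hypothetical.

```python
import numpy as np

def kernel_weights(x, x_o, tau, ratio):
    """Normalized weights K_tau(x_i, x_o) * ratio_i for a Gaussian kernel."""
    sq_dist = np.sum((x - x_o) ** 2, axis=1)
    log_w = -0.5 * sq_dist / tau**2 + np.log(ratio)
    log_w -= log_w.max()  # shift before exponentiating for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

def ess(w):
    """Effective sample size (sum w)^2 / sum w^2."""
    return w.sum() ** 2 / np.sum(w ** 2)

def adapt_tau(x, x_o, ratio, target_frac, lo=1e-3, hi=1e3, iters=60):
    """Bisection on a log scale for the tau whose ESS is target_frac * n."""
    n = x.shape[0]
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if ess(kernel_weights(x, x_o, mid, ratio)) < target_frac * n:
            lo = mid  # kernel too narrow: too few effective samples
        else:
            hi = mid  # kernel wide enough: try a narrower one
    return hi

rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 2))   # stand-in for simulated data
x_o = np.array([0.5, -0.5])      # stand-in for the observed data
ratio = np.ones(len(x))          # density ratio; 1 when sampling from the prior

ess_wide = ess(kernel_weights(x, x_o, 1.0, ratio))
ess_narrow = ess(kernel_weights(x, x_o, 0.1, ratio))
tau = adapt_tau(x, x_o, ratio, target_frac=0.2)
ess_adapted = ess(kernel_weights(x, x_o, tau, ratio))

print("ESS at tau = 1.0:", ess_wide)
print("ESS at tau = 0.1:", ess_narrow)
print("adapted tau:", tau, "ESS:", ess_adapted)
```

With unit density ratios the ESS grows monotonically with `tau`, so the bisection is well defined; with general ratios monotonicity is not guaranteed and the search should be read as a heuristic.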

Paper Structure (18 sections, 2 theorems, 51 equations, 10 figures, 1 table, 2 algorithms)


Key Result

Theorem 3.1

Let $g(\theta,x)$ be any function, and assume that $h_g(x)$ and $h_{g^2}(x)$ are twice continuously differentiable with bounded second-order derivatives. As $\tau \to 0$, we have [equations omitted], where $C=(2 \pi)^{-d / 2} 2^{-d / 2}|\Sigma|^{-1/2}$.
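
The constant $C$ in the statement above can be written more compactly by merging the two powers. As a plausibility check, and under an assumption that this summary does not confirm, namely that the calibration kernel is a Gaussian density with covariance $\tau^2\Sigma$ centered at $x_o$, the same constant is what the integral of the squared kernel produces:

$$
C=(2\pi)^{-d/2}\,2^{-d/2}\,|\Sigma|^{-1/2}=(4\pi)^{-d/2}\,|\Sigma|^{-1/2},
\qquad
\int_{\mathbb{R}^d}\mathcal{N}\!\left(x;\,x_o,\,\tau^2\Sigma\right)^2\,dx
=(4\pi)^{-d/2}\,\left|\tau^2\Sigma\right|^{-1/2}
=C\,\tau^{-d}.
$$

The factor $\tau^{-d}$ grows without bound as $\tau \to 0$, which is consistent with the abstract's remark that calibration kernels may inflate the variance of the empirical loss.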

Figures (10)

  • Figure 1: The calibration kernel adjusts the sample weights around $x_o$, where larger points are assigned higher weights. Left plot: only the density ratio $p(\theta)/\tilde{p}(\theta)$ is used to weight each sample, without considering the calibration kernel. Middle plot: the calibration kernel with rate $\tau=1$ is incorporated to adjust the sample weights. Right plot: the calibration kernel is applied with rate $\tau=0.1$.
  • Figure 2: Simulation experiments on proposed strategies. A. Performance on the M/G/1 queuing model. B. Performance on the Lotka-Volterra predation model. The horizontal axis represents the round of training and the error bars represent the mean with the upper and lower quartiles. Our proposed method exhibits better performance compared to the original method.
  • Figure 3: Our proposed strategies (All-SNPE-B and ACK-APT) versus other methods. A. Performance on the M/G/1 queuing model. B. Performance on the Lotka-Volterra model. The horizontal axis represents the round of training and the error bars represent the mean with the upper and lower quartiles.
  • Figure 4: Approximation accuracy of the SMC-ABC method against the number of simulations. A. Performance on the M/G/1 queuing model. B. Performance on the Lotka-Volterra model.
  • Figure 5: Performance of different methods on the state-space model. A. Inference on one year of Seattle rental price data ($T = 365$). B. Inference on four weeks of simulated data ($T = 28$). The horizontal axis represents the round of training and the error bars represent the mean with the upper and lower quartiles.
  • ...and 5 more figures

Theorems & Definitions (4)

  • Theorem 3.1
  • Theorem 3.2
  • proof
  • proof