Neural Methods for Amortized Inference

Andrew Zammit-Mangion, Matthew Sainsbury-Dale, Raphaël Huser

TL;DR

This article reviews recent progress in neural methods for amortized inference, covering point estimation, approximate Bayesian inference, summary-statistic construction, and likelihood approximation.

Abstract

Simulation-based methods for statistical inference have evolved dramatically over the past 50 years, keeping pace with technological advancements. The field is undergoing a new revolution as it embraces the representational capacity of neural networks, optimization libraries and graphics processing units for learning complex mappings between data and inferential targets. The resulting tools are amortized, in the sense that, after an initial setup cost, they allow rapid inference through fast feed-forward operations. In this article we review recent progress in the context of point estimation, approximate Bayesian inference, summary-statistic construction, and likelihood approximation. We also cover software, and include a simple illustration to showcase the wide array of tools available for amortized inference and the benefits they offer over Markov chain Monte Carlo methods. The article concludes with an overview of relevant topics and an outlook on future research directions.

Paper Structure (40 sections, 37 equations, 11 figures, 2 tables)

Figures (11)

  • Figure 1: (Left) A hypothetical nonnegative function $g(X, \delta)$. To find the optimal decision $\delta^*$ for a given $X$ (shown as red dots for $X = 0$ and $X = 1.5$), $g(X, \delta)$ must be minimized along a slice at $X$ (black dotted lines). (Centre) A function $\delta^*(\cdot)$ (red solid line) that minimizes $g(\cdot, \cdot)$ for any $X \in \mathcal{X}$, whose existence is proved by Brown_1973, and alternative decision rules (orange dashed lines). If known, or well-approximated, in closed form, $\delta^*(\cdot)$ can be used to quickly make optimal decisions for any $X \in \mathcal{X}$. (Right) The optimal decision rule $\delta^*(\cdot)$ satisfies $g\{X, \delta^*(X)\} = \inf_\delta g(X, \delta)$ for all $X \in \mathcal{X}$, and therefore minimizes $\int_{\mathcal{X}} g\{X, \delta(X)\}\,\textrm{d}\mu(X)$ under any strictly positive measure $\mu(\cdot)$ on $\mathcal{X}$.
  • Figure 2: Graphical representation of the neural Bayes estimator that takes data as input and that outputs a parameter point estimate.
  • Figure 3: Graphical representation of an inference network that takes data as input and that outputs parameters of an approximate posterior distribution.
  • Figure 4: Graphical representation depicting a summary network that takes data as input and that outputs summary statistics, and an inference network that takes the summary statistics as input and that outputs parameters of an approximate posterior distribution.
  • Figure 5: Graphical representation of a neural binding function, a neural network that takes parameters as input and that outputs binding function parameters used to construct a synthetic likelihood function.
  • ...and 6 more figures
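
The idea behind the captions of Figures 1 and 2 can be illustrated with a minimal sketch, which is not taken from the paper. It assumes a toy conjugate model (a standard normal prior on the parameter and unit-variance normal data) and uses a linear decision rule fitted by least squares as a stand-in for a neural network. All names (`n`, `K`, `estimate`) are hypothetical. The rule is fitted once by minimizing the empirical squared-error risk over simulated parameter-data pairs, after which each new data set requires only a single cheap function evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 10, 20_000  # hypothetical sample size and number of training simulations

# Setup cost: simulate parameters from the prior and data from the model.
theta = rng.normal(0.0, 1.0, size=K)               # theta ~ N(0, 1)
Z = rng.normal(theta[:, None], 1.0, size=(K, n))   # Z_i | theta ~ N(theta, 1)

# Fit a linear decision rule delta(Z) = a + b * mean(Z) by minimizing the
# empirical squared-error risk over the simulated pairs (ordinary least squares).
# A neural network would replace this linear family in practice.
zbar = Z.mean(axis=1)
design = np.column_stack([np.ones(K), zbar])
(a, b), *_ = np.linalg.lstsq(design, theta, rcond=None)

def estimate(z):
    """Amortized point estimate: one cheap evaluation per new data set."""
    return a + b * np.mean(z, axis=-1)

# For this toy model the posterior mean is known: n * mean(z) / (n + 1).
z_new = rng.normal(0.5, 1.0, size=n)
print(estimate(z_new), n * z_new.mean() / (n + 1))
```

In this toy setting the fitted slope `b` should be close to `n / (n + 1)`, the coefficient of the exact posterior mean, so the fitted rule approximates the Bayes estimator under squared-error loss without any per-data-set optimization.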