All-in-one simulation-based inference

Manuel Gloeckler, Michael Deistler, Christian Weilbach, Frank Wood, Jakob H. Macke

TL;DR

This work presents the Simformer, a transformer-based diffusion model for amortized simulation-based inference (SBI) that learns the joint distribution $p(\boldsymbol{\theta}, \boldsymbol{x})$ and enables sampling of arbitrary conditionals, including both posterior and likelihood. By encoding variables as tokens and using tunable attention masks, it exploits dependency structure and supports function-valued/infinite-dimensional parameters as well as unstructured data. Diffusion guidance further allows conditioning on observation intervals or constraints without retraining. Across benchmarks in ecology, epidemiology, and neuroscience, the Simformer achieves higher accuracy with far fewer simulations than prior SBI methods and provides a unified framework to access multiple conditional distributions quickly and flexibly.
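
The token-and-mask idea in the summary above can be made concrete with a small data-layout sketch. It is not the authors' implementation: the names `Token`, `make_tokens`, and `attention_mask` are hypothetical, the mask convention (a child attends to its parents, and every token attends to itself) is an assumption, and the learned embeddings, the transformer, and the diffusion model are omitted entirely. It only shows how a single joint model can serve different conditionals by flipping the per-variable condition state.

```python
from dataclasses import dataclass


@dataclass
class Token:
    identity: int      # which variable this token stands for
    value: float       # current value of that variable
    conditioned: bool  # True = conditioned (C), False = latent (L)


def make_tokens(values, conditioned):
    """Pair each variable value with its index and condition state (hypothetical helper)."""
    return [
        Token(i, float(v), bool(c))
        for i, (v, c) in enumerate(zip(values, conditioned))
    ]


def attention_mask(n, edges, undirected=False):
    """Return mask[i][j] == True when token i may attend to token j.

    Assumed convention: `edges` holds (parent, child) pairs, a child attends
    to its parents, and every token attends to itself.
    """
    mask = [[i == j for j in range(n)] for i in range(n)]
    for parent, child in edges:
        mask[child][parent] = True
        if undirected:
            mask[parent][child] = True
    return mask


# Toy joint over (theta_0, theta_1, x_0, x_1) where x_k depends only on theta_k.
values = [0.3, -1.2, 0.5, 0.9]
edges = [(0, 2), (1, 3)]

# Posterior task: condition on the data, leave the parameters latent.
posterior_tokens = make_tokens(values, conditioned=[False, False, True, True])
# Likelihood task: same variables, but condition on the parameters instead.
likelihood_tokens = make_tokens(values, conditioned=[True, True, False, False])

mask = attention_mask(len(values), edges, undirected=True)
```

In this toy setup, `mask[2][0]` is `True` because `x_0` depends on `theta_0`, while `mask[2][1]` stays `False`, so the structured mask removes the interaction between `x_0` and `theta_1`.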

Abstract

Amortized Bayesian inference trains neural networks to solve stochastic inference problems using model simulations, thereby making it possible to rapidly perform Bayesian inference for any newly observed data. However, current simulation-based amortized inference methods are simulation-hungry and inflexible: They require the specification of a fixed parametric prior, simulator, and inference tasks ahead of time. Here, we present a new amortized inference method -- the Simformer -- which overcomes these limitations. By training a probabilistic diffusion model with transformer architectures, the Simformer outperforms current state-of-the-art amortized inference approaches on benchmark tasks and is substantially more flexible: It can be applied to models with function-valued parameters, it can handle inference scenarios with missing or unstructured data, and it can sample arbitrary conditionals of the joint distribution of parameters and data, including both posterior and likelihood. We showcase the performance and flexibility of the Simformer on simulators from ecology, epidemiology, and neuroscience, and demonstrate that it opens up new possibilities and application domains for amortized Bayesian inference on simulation-based models.

Paper Structure

This paper contains 50 sections, 27 equations, 23 figures, 1 algorithm.

Figures (23)

  • Figure 1: Capabilities of the Simformer: It can perform inference for simulators with a finite number of parameters or function-valued parameters (first column), it can exploit dependency structures of the simulator to improve accuracy (second column), it can perform inference for unstructured or missing data, for observation intervals (third column), and it provides an 'all-in-one' inference method that can sample all conditionals of the joint distribution, including posterior and likelihood (fourth column).
  • Figure 2: Simformer architecture. All variables (parameters and data) are reduced to a token representation which includes the variables' identity, the variables' value (val) as well as the conditional state (latent (L) or conditioned (C)). This sequence of tokens is processed by a transformer model; the interaction of variables can be explicitly controlled through an attention mask. The transformer architecture returns a score that is used to generate samples from the score-based diffusion model and can be modified (e.g. to guide the diffusion process).
  • Figure 3: Examples of arbitrary conditional distributions of the Two Moons simulator, estimated by the Simformer.
  • Figure 4: Simformer performance on benchmark tasks. The suffixes "undirected graph" and "directed graph" denote Simformer variants with structured attention based on the respective graphical models. (a) Classifier Two-Sample Test (C2ST) accuracy between Simformer- and ground-truth posteriors. (b) C2ST between arbitrary Simformer-conditional distributions and their ground truth.
  • Figure 5: Inference with unstructured observations in the Lotka-Volterra model. (a) Posterior predictive (left) and posterior distribution (right) based on four unstructured observations of the prey population density (green crosses), using Simformer with $10^5$ simulations. True parameters in dark blue. (b) Same as (a) with nine additional observations of the predator population density. (c) Accuracy in estimating the posterior distribution (left) or arbitrary conditionals (right), measured by C2ST.
  • ...and 18 more figures
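
The C2ST metric named in the captions for Figures 4 and 5 trains a classifier to tell estimated samples from reference samples and reports its held-out accuracy: values near 0.5 mean the two sample sets are indistinguishable, and values near 1.0 mean they are easy to separate. The sketch below illustrates the idea under simplifying assumptions: it uses a k-nearest-neighbour classifier on one-dimensional samples, which is a stand-in chosen for brevity and not the classifier used in the paper, and the function name `c2st_knn` and the toy Gaussian samples are hypothetical.

```python
import random


def c2st_knn(samples_p, samples_q, k=5, seed=0):
    """Held-out accuracy of a k-NN classifier separating two 1-D sample sets.

    Illustrative stand-in only; k should be odd so that votes cannot tie.
    """
    rng = random.Random(seed)
    # Label reference samples 0 and estimated samples 1, then shuffle.
    data = [(x, 0) for x in samples_p] + [(x, 1) for x in samples_q]
    rng.shuffle(data)
    split = len(data) // 2
    train, test = data[:split], data[split:]

    correct = 0
    for x, label in test:
        # Majority vote among the k training points closest to x.
        nearest = sorted(train, key=lambda t: abs(t[0] - x))[:k]
        votes = sum(lbl for _, lbl in nearest)
        predicted = 1 if 2 * votes > k else 0
        correct += predicted == label
    return correct / len(test)


rng = random.Random(1)
reference = [rng.gauss(0.0, 1.0) for _ in range(400)]      # stand-in for ground truth
good_estimate = [rng.gauss(0.0, 1.0) for _ in range(400)]  # same distribution
bad_estimate = [rng.gauss(4.0, 1.0) for _ in range(400)]   # clearly shifted

acc_good = c2st_knn(reference, good_estimate)  # expected to be close to 0.5
acc_bad = c2st_knn(reference, bad_estimate)    # expected to be close to 1.0
```

Lower is better for this metric: an accurate posterior estimate should score close to 0.5 against the reference samples.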