Simultaneous identification of models and parameters of scientific simulators

Cornelius Schröder, Jakob H. Macke

TL;DR

This work addresses the challenge of identifying both the component structure and parameters of compositional scientific simulators under likelihood-free settings. It introduces Simulation-Based Model Inference (SBMI), which jointly infers $p(M|x)$ and $p(\theta|M,x)$ using amortized neural networks: a conditional mixture of Grassmann distributions (MoGr) for model posteriors and a marginalized Gaussian Mixture Density Network for parameter posteriors, with a graph-based model prior guiding component inclusion. The method is demonstrated on additive, drift-diffusion, and Hodgkin-Huxley models, revealing multiple data-consistent configurations, exposing non-identifiable components, and delivering calibrated predictive posteriors and interpretable interactions between model components. These results enable data-driven, uncertainty-aware comparisons over model compositions and support principled domain knowledge integration in complex scientific modeling. SBMI’s amortized framework facilitates rapid inference on new data and can be extended to varying-output simulators and symbolic regression-like analyses, with potential broad impact across sciences where modular, interacting mechanisms underlie observed phenomena.

Abstract

Many scientific models are composed of multiple discrete components, and scientists often make heuristic decisions about which components to include. Bayesian inference provides a mathematical framework for systematically selecting model components, but defining prior distributions over model components and developing associated inference schemes have been challenging. We approach this problem in a simulation-based inference framework: We define model priors over candidate components and, from model simulations, train neural networks to infer joint probability distributions over both model components and associated parameters. Our method, simulation-based model inference (SBMI), represents distributions over model components as a conditional mixture of multivariate binary distributions in the Grassmann formalism. SBMI can be applied to any compositional stochastic simulator without requiring likelihood evaluations. We evaluate SBMI on a simple time series model and on two scientific models from neuroscience, and show that it can discover multiple data-consistent model configurations, and that it reveals non-identifiable model components and parameters. SBMI provides a powerful tool for data-driven scientific inquiry which will allow scientists to identify essential model components and make uncertainty-informed modelling decisions.

Paper Structure

This paper contains 46 sections, 1 theorem, 23 equations, 18 figures, 15 tables, 2 algorithms.

Key Result

Proposition A6.1

Optimizing the SBMI loss function $\mathcal{L}(\psi, \phi) = - \frac{1}{L} \sum_{l} \left[ \mathcal{L}_{M_l}(\psi) + \mathcal{L}_{\theta_l}(\phi) \right]$ minimizes the expected Kullback-Leibler divergence between the true joint posterior $p(M,\theta|x)$ and the approximation $q_\psi(M|x)q_\phi(\theta|M,x)$.
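
The reasoning behind this statement can be sketched in a few lines. The block below is an illustrative derivation, not the paper's proof. It assumes that the per-sample terms are log-probabilities, $\mathcal{L}_{M_l}(\psi) = \log q_\psi(M_l|x_l)$ and $\mathcal{L}_{\theta_l}(\phi) = \log q_\phi(\theta_l|M_l,x_l)$, and that the training tuples $(M_l, \theta_l, x_l)$ are drawn from the joint $p(M)\,p(\theta|M)\,p(x|M,\theta)$. It writes $q_\psi$ for the model posterior network, as in the figure captions, and $q_\phi$ for the parameter posterior network.

```latex
\begin{align}
\mathbb{E}_{p(x)}\Big[ D_{\mathrm{KL}}\big( p(M,\theta|x) \,\|\, q_\psi(M|x)\, q_\phi(\theta|M,x) \big) \Big]
  &= \mathbb{E}_{p(M,\theta,x)}\big[ \log p(M,\theta|x) - \log q_\psi(M|x) - \log q_\phi(\theta|M,x) \big] \\
  &= C - \mathbb{E}_{p(M,\theta,x)}\big[ \log q_\psi(M|x) + \log q_\phi(\theta|M,x) \big] \\
  &\approx C - \frac{1}{L} \sum_{l=1}^{L} \big[ \log q_\psi(M_l|x_l) + \log q_\phi(\theta_l|M_l,x_l) \big]
   = C + \mathcal{L}(\psi, \phi),
\end{align}
% C = E_{p(M,theta,x)}[ log p(M,theta|x) ] does not depend on psi or phi.
```

Under these assumptions the constant $C$ is independent of the network weights, so minimizing the Monte Carlo estimate $\mathcal{L}(\psi, \phi)$ and minimizing the expected KL divergence lead to the same optimum.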

Figures (18)

  • Figure 1: Simulation-based model inference (SBMI) scheme. (a) The model prior $p(M)$ is given implicitly by a graph. A random walk from the start to the end node corresponds to a draw from this prior. (b) We first sample from the model prior and the corresponding parameter priors $p(\theta_i)$ to compile a forward model. Following this sampling procedure, we generate training data with which we can learn an approximation of the joint posterior $p(M,\theta|x)$ by factorizing the posterior into $p(M|x) p(\theta|M,x)$. Finally, we can evaluate this posterior for some observed data $x_o$.
  • Figure 2: SBMI network architecture. Data $x$ is passed through an embedding net (EN). The embedded data $e$ is forwarded to the model posterior network (MPN), which learns posteriors over different model components, and the parameter posterior network (PPN) which learns the posterior distributions over parameters given specific models $M$. Gray boxes correspond to network inputs / outputs.
  • Figure 3: Additive model. (a) Model prior represented as a graph; the width of the edges corresponds to their initial weights, which change dynamically. A random walk from start (S) to end node (E) corresponds to one draw from the prior. Four prior samples are shown. (b) Empirical prior distribution, reference and SBMI posterior distribution for one example observation, generated by the model highlighted by the red dashed line. The model vectors are shown as a binary image, with black indicating the presence of the specific model component. SBMI accurately recovers the posterior over model components. Marginal distributions in Fig. \ref{fig:app:additive}. (c) One- and two-dimensional marginals of the parameter posterior inferred with SBMI, conditioned on the 'true model' (red dotted line in (b)). Note the strongly negatively correlated (degenerate) posterior between the redundant model components $l_1$ and $l_2$. Parameter posteriors for additional models in Fig. \ref{fig:app:additive}. (d) Predictive samples for an observation $x_o$ from $f_\text{gt}$. Blue: Mean $\pm$ std. as local uncertainties of the posterior predictives $x \sim p(x|\theta, M)$ with $\theta \sim p(\theta|M,x_o)$.
  • Figure 4: SBMI performance for the additive model. (a) KL divergence of the SBMI model posterior $q_\psi(M|x_o)$ to the reference posterior $\hat{p}_\text{ref}(M|x_o)$ for 100 observations $x_o$. (b) Posterior predictive performance in terms of RMSE between 1k observations $x_o$ and posterior samples $x_l$. Red line indicates RMSE between $x_o$ and samples from the ground truth (gt) model as lower bound. (a) and (b) show mean and std. for 5 training runs and different numbers of training samples (from 5k to 500k). (c) Histogram of the c2st ranks for the additive model with six components with a c2st mode of 0.54 (0.48/0.61 as .05/.95 percentiles). A value of 0.5 indicates a well calibrated posterior for which the rank statistics are indistinguishable from a uniform distribution. (d) Same as (c) for the additive model with eleven components with a c2st mode of 0.52 (0.43/0.62, 'all') and 0.53 (0.48/0.61, '100 most likely'). See also Fig. \ref{fig:app:SBC_additive} for individual SBC plots.
  • Figure 5: SBMI on Drift-Diffusion Models. (a) A decision process is modelled by a one-dimensional stochastic process. A binary decision is taken once the process hits the upper or lower boundary, resulting in a two-dimensional output (a continuous decision time and a binary decision). (b) The model prior is a graph consisting of two drift ($d_c$, $d_l$) and two boundary ($b_c$, $b_{exp}$) components, as well as a non-decision time ($ndt$). (c) Example parameter posterior inferred with SBMI for which both the ground truth model and the predicted model have leaky drift and exponentially collapsing boundary conditions. (d) Posterior predictives with local uncertainties as mean $\pm$ std. for the two most likely models (dark blue with $q_{\psi}(M|x_o)=0.75$ and light blue with $q_{\psi}(M|x_o)=0.25$).
  • ...and 13 more figures
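
The graph-based model prior described in the captions of Figures 1(a) and 3(a) can be illustrated with a small sketch: a random walk from the start node to the end node yields a binary model vector marking which components were visited. Everything concrete here is an assumption made for illustration: the graph layout, the component names (`l1`, `l2`, `sin`), the edge weights, and the helper names `sample_model` and `empirical_prior`. The weights are also kept fixed, whereas the caption states that they change dynamically during the walk.

```python
import random

# Hypothetical prior graph: node -> list of (successor, edge weight).
# "S" and "E" are the start and end nodes; the component names are made up.
GRAPH = {
    "S": [("l1", 1.0), ("l2", 1.0)],
    "l1": [("l2", 1.0), ("sin", 1.0), ("E", 1.0)],
    "l2": [("sin", 2.0), ("E", 1.0)],
    "sin": [("E", 1.0)],
}
COMPONENTS = ["l1", "l2", "sin"]


def sample_model(rng):
    """Random walk from S to E; return a binary vector of visited components."""
    visited = set()
    node = "S"
    while node != "E":
        successors, weights = zip(*GRAPH[node])
        # Pick the next node with probability proportional to the edge weight.
        node = rng.choices(successors, weights=weights, k=1)[0]
        if node != "E":
            visited.add(node)
    return tuple(int(c in visited) for c in COMPONENTS)


def empirical_prior(n_samples, seed=0):
    """Estimate p(M) by counting how often each model vector is drawn."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_samples):
        model = sample_model(rng)
        counts[model] = counts.get(model, 0) + 1
    return {model: count / n_samples for model, count in counts.items()}


if __name__ == "__main__":
    for model, prob in sorted(empirical_prior(10_000).items()):
        print(model, round(prob, 3))
```

Because this toy graph is acyclic, every walk terminates, and the structure of the graph rules out some combinations entirely (here, a model containing `sin` without `l1` or `l2`). This is one way in which a graph prior could encode domain knowledge about admissible model compositions.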

Theorems & Definitions (2)

  • Proposition A6.1
  • proof