Simulation-based inference for stochastic nonlinear mixed-effects models with applications in systems biology

Henrik Häggström, Sebastian Persson, Marija Cvijovic, Umberto Picchini

TL;DR

We propose a novel methodology for scalable Bayesian inference in hierarchical mixed-effects models. It accommodates both stochastic and deterministic models, and it is shown to be fast and competitive in statistical accuracy when compared to exact pseudomarginal Bayesian inference.

Abstract

The analysis of data from multiple experiments, such as observations of several individuals, is commonly approached using mixed-effects models, which account for variation between individuals through hierarchical representations. This makes mixed-effects models widely applied in fields such as biology, pharmacokinetics, and sociology. In this work, we propose a novel methodology for scalable Bayesian inference in hierarchical mixed-effects models. Our framework first constructs amortized approximations of the likelihood and the posterior distribution, which are then rapidly refined for each individual dataset, ultimately yielding an approximation of the parameter posterior across many individuals. The framework is easily trainable, as it uses mixtures of experts but without neural networks, leading to parsimonious yet expressive surrogate models of the likelihood and the posterior. We demonstrate the effectiveness of our methodology using challenging stochastic models, such as mixed-effects stochastic differential equations emerging in systems biology-driven problems. However, the approach is broadly applicable and can accommodate both stochastic and deterministic models. We show that our approach can seamlessly handle inference for many parameters. Additionally, we apply our method to a real-data case study of mRNA transfection. When compared to exact pseudomarginal Bayesian inference, our approach proved to be both fast and competitive in terms of statistical accuracy.
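To make the hierarchical structure described in the abstract concrete, the following is a minimal sketch of a forward simulator for an SDE mixed-effects model, using an Ornstein-Uhlenbeck process as in the paper's first example. It is an illustration only and is not the authors' code or the SeMPLE method itself: the function name, the log-normal population distribution, the parameter values, the zero initial state, and the Euler-Maruyama discretization are all assumptions made for this example.

```python
import numpy as np


def simulate_ou_mixed_effects(n_individuals=40, n_steps=200, dt=0.05, seed=0):
    """Simulate one Ornstein-Uhlenbeck path per individual (hypothetical helper).

    Each individual i has its own parameters (rate_i, level_i, sigma_i) and follows
        dX = rate_i * (level_i - X) dt + sigma_i dW.
    """
    rng = np.random.default_rng(seed)

    # Population-level parameters (illustrative values, not taken from the paper):
    # mean and standard deviation of the log individual parameters.
    pop_mean = np.log(np.array([1.0, 2.0, 0.5]))
    pop_std = np.array([0.2, 0.2, 0.2])

    # Individual-level parameters drawn from the population distribution.
    # Working on the log scale keeps every parameter positive.
    log_params = pop_mean + pop_std * rng.standard_normal((n_individuals, 3))
    params = np.exp(log_params)
    rate, level, sigma = params.T

    # Euler-Maruyama discretization, vectorized over individuals.
    paths = np.zeros((n_individuals, n_steps + 1))  # assumed initial state X_0 = 0
    for k in range(n_steps):
        noise = rng.standard_normal(n_individuals)
        drift = rate * (level - paths[:, k])
        paths[:, k + 1] = paths[:, k] + drift * dt + sigma * np.sqrt(dt) * noise

    return params, paths


if __name__ == "__main__":
    params, paths = simulate_ou_mixed_effects()
    print(params.shape, paths.shape)
```

A simulator of this kind is the only ingredient a simulation-based method needs from the model: parameters are drawn at the population level, individual parameters are drawn conditionally on them, and data are then simulated per individual.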

Paper Structure

This paper contains 29 sections, 34 equations, 23 figures, 4 tables.

Figures (23)

  • Figure 1: Bayesian network model structure for an SDE mixed-effects model.
  • Figure 2: Ornstein-Uhlenbeck: marginal posteriors from 10k posterior samples from MCMC using the exact likelihood (purple) and round $r=4$ of SeMPLE (orange). Priors are in green. The dashed lines are the true parameter values.
  • Figure 3: Ornstein-Uhlenbeck: posterior-predictive simulations from SeMPLE ($r=4$) and data (colored lines, 40 individuals). In grey is the area between the 2.5th and 97.5th percentile from 10k posterior-predictive simulations obtained from SeMPLE.
  • Figure 4: mRNA model with 40 simulated individuals: marginal posteriors obtained with SeMPLE (orange, round $r=4$) and with PEPSDI (purple). Priors are in green. The dashed lines are the true parameter values that were used to generate the observed data.
  • Figure 5: mRNA model with real data: posterior-predictive simulations for 40 individuals using SeMPLE ($r=4$, colored lines are observed data). In grey is the area between the 2.5th and 97.5th percentile from $1,000$ posterior-predictive simulations obtained from SeMPLE.
  • ...and 18 more figures