Table of Contents
Fetching ...

Simulation and inference methods for non-Markovian stochastic biochemical reaction networks

Thomas P. Steele, David J. Warne

TL;DR

This paper tackles forward and inverse problems for non-Markovian stochastic biochemical reaction networks that include delays dependent on system state and time. It develops a general non-Markovian framework and extends the Next Reaction Method and tau-leaping to arbitrary inter-event time distributions, plus a coupling scheme to generate correlated exact and approximate paths. This enables effective multifidelity simulation-based inference, demonstrated via a gene regulation model with delayed auto-inhibition, achieving substantial gains in both simulation accuracy and inference efficiency. The work broadens the practical applicability of non-Markovian models in systems biology and related fields.

Abstract

Stochastic models of biochemical reaction networks are widely used to capture intrinsic noise in cellular systems. The typical formulation of these models are based on Markov processes for which there is extensive research on efficient simulation and inference. However, there are biological processes, such as gene transcription and translation, that introduce history dependent dynamics requiring non-Markovian processes to accurately capture the stochastic dynamics of the system. This greater realism comes with additional computational challenges for simulation and parameter inference. We develop efficient stochastic simulation algorithms for well-mixed non-Markovian stochastic biochemical reaction networks with delays that depend on system state and time. Our methods generalize the next reaction method and $τ$-leaping method to support arbitrary inter-event time distributions while preserving computational scalability. We also introduce a coupling scheme to generate exact non-Markovian sample paths that are positively correlated to an approximate non-Markovian $τ$-leaping sample path. This enables substantial computational gains for Bayesian inference of model parameters though multifidelity simulation-based inference schemes. We demonstrate the effectiveness of our approach on a gene regulation model with delayed auto-inhibition, showing substantial gains in both simulation accuracy and inference efficiency of two orders of magnitude. These results extend the practical applicability of non-Markovian models in systems biology and beyond.

Simulation and inference methods for non-Markovian stochastic biochemical reaction networks

TL;DR

This paper tackles forward and inverse problems for non-Markovian stochastic biochemical reaction networks that include delays dependent on system state and time. It develops a general non-Markovian framework and extends the Next Reaction Method and tau-leaping to arbitrary inter-event time distributions, plus a coupling scheme to generate correlated exact and approximate paths. This enables effective multifidelity simulation-based inference, demonstrated via a gene regulation model with delayed auto-inhibition, achieving substantial gains in both simulation accuracy and inference efficiency. The work broadens the practical applicability of non-Markovian models in systems biology and related fields.

Abstract

Stochastic models of biochemical reaction networks are widely used to capture intrinsic noise in cellular systems. The typical formulation of these models are based on Markov processes for which there is extensive research on efficient simulation and inference. However, there are biological processes, such as gene transcription and translation, that introduce history dependent dynamics requiring non-Markovian processes to accurately capture the stochastic dynamics of the system. This greater realism comes with additional computational challenges for simulation and parameter inference. We develop efficient stochastic simulation algorithms for well-mixed non-Markovian stochastic biochemical reaction networks with delays that depend on system state and time. Our methods generalize the next reaction method and -leaping method to support arbitrary inter-event time distributions while preserving computational scalability. We also introduce a coupling scheme to generate exact non-Markovian sample paths that are positively correlated to an approximate non-Markovian -leaping sample path. This enables substantial computational gains for Bayesian inference of model parameters though multifidelity simulation-based inference schemes. We demonstrate the effectiveness of our approach on a gene regulation model with delayed auto-inhibition, showing substantial gains in both simulation accuracy and inference efficiency of two orders of magnitude. These results extend the practical applicability of non-Markovian models in systems biology and beyond.

Paper Structure

This paper contains 19 sections, 47 equations, 6 figures, 6 algorithms.

Figures (6)

  • Figure 1: Example realisations of the gene regulation with delayed autoinhibition model (Section \ref{['subsec: 3.1 gene regulation with delayed autoinhibition']}) using the exact nM-NRM (a)--(b) and approximate nM-TLM with time step $\tau = 2.5$ (c)--(d). $n = 4$ independent paths are shown for (a) nM-NRM and (c) nM-TLM. Path distributions are shown using $n =100$ realisations for (b) nM-NRM and (d) nM-TLM. Model parameters are $\beta_m = 10$, $\beta_{m^*} = 0.175$, $\beta_p = 1$, $\gamma_m = 0.08$, $\gamma_p = 0.05$, $\alpha = 2.5$, $M_a = 10$, $v = 0.5$, $P_a = 5$, and $h = 1.5$.
  • Figure 2: Demonstration of the coupling scheme (Section \ref{['subsec: 2.3.3 cor alg']}, Algorithms \ref{['alg: complete PoisProc']} and \ref{['alg: nm-nrm-coupled']}) for nM-NRM realisations (solid lines) that are correlated to a given nM-TLm realisation (dashed lines). Each panel shows $n = 10$ independent nM-NRM simulations that are: (a) uncoupled and independent of the nM-TLM simulation with $\tau =2.5$; (b) coupled and correlated to the nM-TLM simulation with $\tau =2.5$; (c) uncoupled and independent of the nM-TLM simulation with $\tau =0.05$; and (d) coupled and correlated to the nM-TLM simulation with $\tau=0.05$. Model parameters are $\beta_m = 10$, $\beta_{m^*} = 0.175$, $\beta_p = 1$, $\gamma_m = 0.08$, $\gamma_p = 0.05$, $\alpha = 2.5$, $M_a = 10$, $v = 0.5$, $P_a = 5$, and $h = 1.5$.
  • Figure 3: Empirical convergence of the nM-TLM under weak error (green diamonds) and strong error (orange circles). Lines corresponding to a weak convergence rate of $\mathcal{O}(\tau)$ (green dashed line) and a strong convergence rate of $\mathcal{O}(\tau^{1/2})$ (orange dashed line) are show for reference.
  • Figure 4: Synthetic data generated from an exact realisation of the gene transcription with delayed auto-inhibition model (Section \ref{['subsec: 3.1 gene regulation with delayed autoinhibition']}). The full sample path for mRNA $M(t)$ (green line), proteins $P(t)$ (orange line), and incomplete transcriptions $M^*(t)$ (purple line) are shown along with the discrete observations contributing to the data $y_{obs}$ (black crosses). Here the model parameters are $\beta_m = 10$, $\beta_{m^*} = 0.175$, $\beta_p = 1$, $\gamma_m = 0.08$, $\gamma_p =0.05$, $\alpha = 2.5$, $M_a = 10$, $v = 0.5$, $h = 1.5$, and $P_a = 5$. The initial condition is $M(t_0) = P(t_0) = M^*(t_0) = 0$.
  • Figure 5: Comparison of convergence rate of posterior mean estimator variance as a function of computational cost for: (a) mRNA transcription initiation rate $\beta_m$; (b) protein translation rate $\beta_p$; (c) transcription completion time shape parameter $\alpha$. Results are shown for standard ABC methods (green diamonds), MF-ABC with $\tau = 0.68$ (orange circles), and MF-ABC with $\tau = 2.31$ (purple squares). Regressions lines are also shown (dashed lines). The ABC discrepancy threshold used is $\varepsilon = 460$.
  • ...and 1 more figures