Inference of dynamical gene regulatory networks from single-cell data with physics informed neural networks

Maria Mircea, Diego Garlaschelli, Stefan Semrau

TL;DR

This work tackles the challenge of inferring mechanistic, predictive gene regulatory networks (GRNs) from single-cell data, where traditional correlation-based methods fall short. It proposes physics-informed neural networks (PINNs) as a framework to learn GRN parameters by enforcing the underlying differential equations that govern gene interactions and intercellular signaling. The authors demonstrate that PINNs outperform a naive feed-forward NN in parameter inference, and show that PINNs can recover GRN parameters from both time-resolved trajectories with cell communication and snapshot population data without communication, including scenarios with partial or noisy data. This approach provides a principled means to obtain mechanistic insights from single-cell measurements and offers guidance for experimental design to maximize informative data for PINN-based GRN inference.

Abstract

One of the main goals of developmental biology is to reveal the gene regulatory networks (GRNs) underlying the robust differentiation of multipotent progenitors into precisely specified cell types. Most existing methods to infer GRNs from experimental data have limited predictive power, as the inferred GRNs merely reflect gene expression similarity or correlation. Here, we demonstrate how physics-informed neural networks (PINNs) can be used to infer the parameters of predictive, dynamical GRNs that provide mechanistic understanding of biological processes. Specifically, we study GRNs that exhibit bifurcation behavior and can therefore model cell differentiation. We show that PINNs outperform regular feed-forward neural networks on the parameter inference task and analyze two relevant experimental scenarios: (1) a system with cell communication for which gene expression trajectories are available and (2) snapshot measurements of a cell population in which cell communication is absent. Our analysis will inform the design of future experiments to be analyzed with PINNs and provides a starting point to explore this powerful class of neural network models further.

Paper Structure (22 sections, 7 equations, 7 figures)

Figures (7)

  • Figure 1: Gene regulatory network (GRN) inference with neural networks (NNs). First, a particular topology of the GRN is assumed. Together with the functional form of the interactions, the GRN topology defines a set of differential equations with undetermined parameter values. Next, a NN is trained on experimental or simulated time-series data. The parameters learned during training set the strength of interactions between genes. The fully determined dynamical system can then be used for predictions.
  • Figure 2: Cell communication drives bifurcations in a GRN model of differentiation. (a) System of differential equations corresponding to the GRN model by Stanoev et al. The mutual inhibition of the two master transcription factors $u$ and $v$ as well as the inhibition of $u$ by the signaling molecule $s$ are modeled with repressive Hill functions $H_I$. The cell-autonomous activation of signaling molecule $s$ by $u$ is modeled with an activating Hill function $H_A$. $i$ is the cell index. $s_{ext}$ is the level of $s$ averaged over cell $i$ and its neighbors (typically nearest neighbors, unless otherwise indicated by the edges in panel b). The degradation rate for $u$, $v$ and $s$ is assumed to be identical, and time was rescaled with the inverse degradation rate, so that the rate does not appear explicitly in the equations. (b) Studied configurations of cells. Edges indicate cell communication. (c) Results for the 2-cell configuration. Several bifurcations are driven by the parameter $a_u$, which sets the strength of the inhibition of $u$ by $v$. (d) Results for the 4-cell configuration with communication between all cells. Bifurcations are controlled by the parameter $a_{us}$, which determines the strength of inter-cellular communication. Colors distinguish stable states with different ratios of $u$- and $v$-high cells. (e, f) Steady states (both stable and unstable) for the cell configurations shown in panel b without cell communication (panel e) or with cell communication (panel f). The following parameters were used: $a_u = 2.4$, $a_v = 3.5$, $a_s = 2$, $a_{us} = 1$.
  • Figure 3: Feedforward NN regression is unsuitable for GRN parameter inference. (a) Architecture of the feedforward NN. The mean absolute error was used for optimization. (b) Training data. Left: Parameter ranges used for creating simulated trajectories. Right: 10 example trajectories. (c) Test loss during training of the NN. (d, e) Ground truth parameter values (used for simulating the trajectories) versus parameter values inferred by the NN. In d, training trajectories covered the mlp as well as the bistable, differentiated regime. In e, the training trajectories came exclusively from the bistable regime.
  • Figure 4: Physics informed neural network to infer gene regulatory networks. (a) Architecture of the PINN. The input to the network is time and the output consists of all dependent variables of the dynamical system. The PINN is optimized via a loss function that considers the differential equations (ODE loss), the initial conditions (IC loss) and training data (data loss). (b, c, d, e) The first row shows examples of training scenarios. A GRN with 4 cells that all communicate with each other was used. (b) Training on noise-free trajectories of all dependent variables with 25 fixed time points. (c) Training on trajectories shown on the left with added Gaussian noise. Only the trajectories in one cell are shown. (d) Training on noise-free trajectories of $u$ only. (e) Only the first and last time point of the $u$ trajectories in all 4 cells were used for training. The second row shows the resulting test losses. Colours indicate the different loss terms. Row 4 shows the inferred parameters and rows 3 - 6 show the approximated trajectories for the four scenarios. In the trajectory plots, solid lines are trajectories approximated by the PINN and dashed lines are trajectories calculated by numerical integration using the inferred parameters.
  • ...and 2 more figures
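
The captions of Figures 2 and 4 describe a dynamical system and a three-part loss (ODE loss, IC loss, data loss). The following is a minimal NumPy sketch of how such a loss could be assembled, not the authors' implementation. Several things are assumptions made for illustration: the exact form of the equations (written here to match the interactions listed in the Figure 2 caption), the Hill exponent of 2, the equal weighting of the three loss terms, the one-hidden-layer tanh network, the explicit Euler integration used to create stand-in training data, and all function names (`grn_rhs`, `net_forward`, `pinn_loss`, and so on). Only the parameter values $a_u = 2.4$, $a_v = 3.5$, $a_s = 2$, $a_{us} = 1$ and the 4-cell, all-to-all configuration are taken from the captions.

```python
import numpy as np

def hill_inhibit(x, n=2):
    """Repressive Hill function H_I (exponent n=2 is an assumption)."""
    return 1.0 / (1.0 + x**n)

def hill_activate(x, n=2):
    """Activating Hill function H_A (same assumed exponent)."""
    return x**n / (1.0 + x**n)

def grn_rhs(x, p, adjacency):
    """Assumed ODE right-hand side. x has shape (T, 3, C): u, v, s for C cells."""
    u, v, s = x[:, 0], x[:, 1], x[:, 2]
    # s_ext: s averaged over cell i and its neighbors (adjacency includes the diagonal).
    s_ext = (s @ adjacency.T) / adjacency.sum(axis=1)
    du = p["a_u"] * hill_inhibit(v) + p["a_us"] * hill_inhibit(s_ext) - u
    dv = p["a_v"] * hill_inhibit(u) - v
    ds = p["a_s"] * hill_activate(u) - s
    return np.stack([du, dv, ds], axis=1)

def init_net(n_cells, hidden=16, seed=0):
    """Random weights for a one-hidden-layer network mapping time to all variables."""
    rng = np.random.default_rng(seed)
    return {"W1": rng.normal(size=(1, hidden)), "b1": np.zeros(hidden),
            "W2": 0.1 * rng.normal(size=(hidden, 3 * n_cells)),
            "b2": np.ones(3 * n_cells)}

def net_forward(net, t):
    """Return x(t) and its exact time derivative, each of shape (T, 3, C)."""
    t = np.asarray(t, dtype=float).reshape(-1, 1)
    h = np.tanh(t * net["W1"] + net["b1"])
    x = h @ net["W2"] + net["b2"]
    # Chain rule: d/dt tanh(t*w + b) = (1 - tanh^2) * w.
    dxdt = ((1.0 - h**2) * net["W1"]) @ net["W2"]
    return x.reshape(len(t), 3, -1), dxdt.reshape(len(t), 3, -1)

def pinn_loss(net, p, adjacency, t_col, x0, t_data, x_data):
    """Sum of ODE, IC and data losses (equal weights are an assumption)."""
    x_col, dxdt_col = net_forward(net, t_col)
    ode_loss = np.mean((dxdt_col - grn_rhs(x_col, p, adjacency)) ** 2)
    ic_loss = np.mean((net_forward(net, [0.0])[0][0] - x0) ** 2)
    data_loss = np.mean((net_forward(net, t_data)[0] - x_data) ** 2)
    return ode_loss + ic_loss + data_loss, (ode_loss, ic_loss, data_loss)

# 4 cells that all communicate with each other; parameter values from the Figure 2 caption.
n_cells = 4
adjacency = np.ones((n_cells, n_cells))
params = {"a_u": 2.4, "a_v": 3.5, "a_s": 2.0, "a_us": 1.0}

# Stand-in training data: explicit Euler integration from a random initial state.
rng = np.random.default_rng(1)
x0 = rng.uniform(0.0, 1.0, size=(3, n_cells))
dt, n_steps = 0.01, 480
traj = [x0]
for _ in range(n_steps):
    traj.append(traj[-1] + dt * grn_rhs(traj[-1][None], params, adjacency)[0])
traj = np.array(traj)
t_all = dt * np.arange(n_steps + 1)
t_data, x_data = t_all[::20], traj[::20]      # 25 fixed time points
t_col = np.linspace(0.0, t_all[-1], 100)      # collocation points for the ODE loss

net = init_net(n_cells)
total, (ode_l, ic_l, data_l) = pinn_loss(net, params, adjacency, t_col, x0, t_data, x_data)
print(f"total={total:.3f} ode={ode_l:.3f} ic={ic_l:.3f} data={data_l:.3f}")
```

In an actual PINN, the entries of `params` would be trainable alongside the network weights, and the loss would be minimized with an automatic-differentiation framework. The sketch only evaluates the loss once for an untrained network to show how the three terms are formed.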