Recovering Time-Varying Networks From Single-Cell Data

Euxhen Hasanaj, Barnabás Póczos, Ziv Bar-Joseph

TL;DR

Marlene is a deep neural network that infers dynamic graphs from time series single-cell gene expression data. It identifies gene interactions relevant to specific biological responses, including COVID-19 immune response, fibrosis, and aging, which may help guide potential treatments.

Abstract

Gene regulation is a dynamic process that underlies all aspects of human development, disease response, and other key biological processes. The reconstruction of temporal gene regulatory networks has conventionally relied on regression analysis, graphical models, or other types of relevance networks. With the large increase in time series single-cell data, new approaches are needed to address the unique scale and nature of this data for reconstructing such networks. Here, we develop a deep neural network, Marlene, to infer dynamic graphs from time series single-cell gene expression data. Marlene constructs directed gene networks using a self-attention mechanism where the weights evolve over time using recurrent units. By employing meta learning, the model is able to recover accurate temporal networks even for rare cell types. In addition, Marlene can identify gene interactions relevant to specific biological responses, including COVID-19 immune response, fibrosis, and aging.
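
The self-attention step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `attention_adjacency`, the projection matrices `w_query` and `w_key`, the scaled dot-product form, and the random toy data are all assumptions made for illustration. It covers a single time point with fixed weights; the recurrent update that evolves the weights over time is not shown.

```python
import numpy as np

def attention_adjacency(gene_features, w_query, w_key):
    """Return row-normalized attention scores between genes.

    Row i holds the attention that gene i pays to every gene, which can be
    read as a weighted, directed adjacency matrix (an illustrative choice).
    """
    queries = gene_features @ w_query                    # (genes, d)
    keys = gene_features @ w_key                         # (genes, d)
    scores = queries @ keys.T / np.sqrt(keys.shape[1])   # scaled dot product
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)  # softmax over each row

# Hypothetical toy input: 5 genes, 8 features per gene, projection size 4.
rng = np.random.default_rng(0)
n_genes, n_features, d = 5, 8, 4
features = rng.normal(size=(n_genes, n_features))
w_q = rng.normal(size=(n_features, d))
w_k = rng.normal(size=(n_features, d))

adjacency = attention_adjacency(features, w_q, w_k)
print(adjacency.shape)
```

Because the query and key projections differ, the resulting matrix is generally asymmetric, which is what allows it to be interpreted as a directed graph.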

Paper Structure (18 sections, 8 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: Overview of Marlene. Marlene takes as input gene expression data in the form of a cell-by-gene matrix. It then performs gene featurization via the pooling by multihead attention (PMA) mechanism, which returns a gene feature matrix. This matrix is passed to a self-attention module to obtain a gene network in the form of an adjacency matrix. The weights of the self-attention module evolve from one time point to the next via a gated recurrent unit (GRU). The expression of transcription factors and the recovered graph are used to reconstruct the full gene expression vector. Finally, the reconstructed matrix is used to predict the cell type for the batch. The network is trained in a model-agnostic meta-learning fashion in which each cell type is treated as a "task" to be learned, enabling the model to adapt quickly to cell types with low representation.
  • Figure 2: Overlap analysis of the SARS-CoV-2 vaccination dataset. Values are $-\log_{10}(\text{FDR})$ from a Fisher's exact test measuring the overlap between predicted TF-gene interactions in the reconstructed networks and two TF-gene interaction databases (TRRUST, RegNetwork). The best-performing method is starred.
  • Figure 3: Temporal analysis of the predicted gene regulatory networks for the SARS-CoV-2 vaccine dataset. (a) Intersection-over-union (IoU) scores between consecutive graphs. (b) For each method, top 3 MSigDB terms enriched for genes that were regulated at day 2 but not day 0.
  • Figure 4: Results on the HLCA dataset. (a) FDR-corrected $p$-values of Fisher's exact tests reflecting the number of links that overlap with the TRRUST and RegNetwork databases. (b) Top 3 Jensen Diseases terms enriched for genes added between the first and second age groups.
  • Figure 5: Enrichment for senescence using the SenMayo set. For four cell types, enrichment was statistically significant in the oldest age group. Only the top 200 regulated genes were used.
  • ...and 2 more figures
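
The two graph comparisons named in the captions above, intersection-over-union between consecutive networks and a Fisher's exact test for overlap with a reference database, can be sketched with the Python standard library alone. This is an illustrative sketch, not the paper's evaluation code: the function names, the one-sided form of the test, the choice of edge universe, and the toy edge lists (generic `TF`/`G` labels, not real interactions) are all assumptions. FDR correction across multiple tests is not included.

```python
from math import comb, log10

def edge_iou(edges_a, edges_b):
    """Intersection-over-union of two edge sets."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def fisher_overlap_pvalue(predicted, reference, universe_size):
    """One-sided Fisher's exact test for enrichment of overlap.

    Computes the hypergeometric upper tail: the probability of seeing at
    least the observed overlap when len(predicted) edges are drawn at random
    from a universe of `universe_size` candidate edges.
    """
    predicted, reference = set(predicted), set(reference)
    overlap = len(predicted & reference)
    n_pred, n_ref = len(predicted), len(reference)
    tail = sum(
        comb(n_ref, i) * comb(universe_size - n_ref, n_pred - i)
        for i in range(overlap, min(n_pred, n_ref) + 1)
    )
    return tail / comb(universe_size, n_pred)

# Hypothetical toy edges as (regulator, target) pairs.
day0 = {("TF1", "G1"), ("TF2", "G2"), ("TF3", "G3")}
day2 = {("TF1", "G1"), ("TF2", "G2"), ("TF6", "G6")}
database = {("TF1", "G1"), ("TF2", "G2"), ("TF4", "G4"), ("TF5", "G5")}

iou = edge_iou(day0, day2)                      # 2 shared edges / 4 total
p_value = fisher_overlap_pvalue(day2, database, universe_size=20)
print(iou, p_value, -log10(p_value))
```

The size of the edge universe strongly affects the p-value, so in practice it should reflect the set of TF-gene pairs that a method could have predicted.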