Mapping gene expression dynamics to developmental phenotypes with information entropy analysis

Ben Ansbacher, Malachy Guzman, Jordi Garcia-Ojalvo, Arjendu K Pattanayak

TL;DR

The paper investigates how time-dependent gene expression dynamics map onto macroscopic developmental phenotypes by framing development as an irreversible thermodynamic trajectory between non-equilibrium states. It introduces an information-theoretic pipeline, built mainly on the permutation entropy of an indexed ensemble $\Pi(t)$ (PI entropy), together with the mean expression $\bar{\mu}$, the Shannon entropy $H$, and the approach Kullback-Leibler distance $A_{KL}$, and applies it to bulk RNA-seq data from 30 developmental time points in Drosophila embryos. The authors show that $\Pi(t)$ and $A_{KL}$ track the developmental progression, with $\Pi(t)$ correlating strongly with the mean developmental stage and revealing irreversibility across the embryogenesis timeline. They further dissect gene-level dynamics by functional class, identifying a major irreversibility event near dorsal closure and class-specific entropy patterns, which suggests a principled link between microscopic transcriptional complexity and macroscopic developmental transitions. The work demonstrates the utility of information-complexity analysis for connecting biomarker dynamics to large-scale developmental trajectories and outlines paths toward higher-resolution, mechanistic integration with gene-regulatory networks and cross-species generalization.

Abstract

The development of multicellular organisms entails a deep connection between time-dependent biochemical processes taking place at the subcellular level, and the resulting macroscopic phenotypes that arise in populations of up to trillions of cells. A statistical mechanics of developmental processes would help to understand how microscopic genotypes map onto macroscopic phenotypes, a general goal across biology. Here we follow this approach, hypothesizing that development should be understood as a thermodynamic transition between non-equilibrium states. We test this hypothesis in the context of the fruit fly, Drosophila melanogaster, a model organism used widely in genetics and developmental biology for over a century. Applying a variety of information-theoretic measures to public transcriptomics datasets of whole fly embryos during development, we show that the global temporal dynamics of gene expression can be understood as a process that probabilistically guides embryonic dynamics across macroscopic phenotypic stages. In particular, we demonstrate signatures of irreversibility in the information complexity of transcriptomic dynamics, as measured mainly by the permutation entropy of indexed ensembles (PI entropy). Our results show that the dynamics of PI entropy correlate strongly with developmental stages. Overall, this is a test case in applying information complexity analysis to relate the statistical mechanics of biomarkers to macroscopic developmental dynamics.
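As an illustration of the central measure, the following is a minimal sketch of a PI-entropy-style computation. It is not the authors' code. It assumes that genes are indexed by their rank-ordered expression in the final time window (as the figure captions describe) and that the complexity of the resulting indexed sequence is scored with the standard Bandt-Pompe permutation entropy. The function names, the embedding order of 3, and the synthetic data are all hypothetical choices made for this example.

```python
import math
from itertools import permutations

import numpy as np


def permutation_entropy(series, order=3):
    """Normalized Bandt-Pompe permutation entropy of a 1-D sequence, in [0, 1]."""
    series = np.asarray(series, dtype=float)
    n_windows = len(series) - order + 1
    if n_windows < 1:
        raise ValueError("series is shorter than the embedding order")
    # Count how often each ordinal pattern occurs among the sliding windows.
    counts = {pattern: 0 for pattern in permutations(range(order))}
    for i in range(n_windows):
        window = series[i:i + order]
        # Stable argsort: ties are broken by position in the window.
        pattern = tuple(np.argsort(window, kind="stable").tolist())
        counts[pattern] += 1
    probs = np.array([c for c in counts.values() if c > 0], dtype=float)
    probs /= n_windows
    entropy = -(probs * np.log(probs)).sum()
    return float(entropy / math.log(math.factorial(order)))


def pi_entropy(expression, reference_column=-1, order=3):
    """Permutation entropy of the indexed ensemble at every timepoint.

    expression: array of shape (n_genes, n_timepoints).
    Genes are re-indexed by their rank in `reference_column`
    (assumed here to be the final time window).
    """
    expression = np.asarray(expression, dtype=float)
    index = np.argsort(expression[:, reference_column], kind="stable")
    indexed = expression[index, :]
    return np.array(
        [permutation_entropy(indexed[:, t], order) for t in range(indexed.shape[1])]
    )


# Synthetic stand-in for an expression matrix (NOT real data):
# an early profile unrelated to the final one, and the final profile itself.
rng = np.random.default_rng(0)
final_profile = rng.random(500)
early_profile = rng.random(500)
pi_values = pi_entropy(np.column_stack([early_profile, final_profile]))
print(pi_values)
```

Under these assumptions the final timepoint is monotone by construction and so has zero permutation entropy, while a profile unrelated to the final ordering scores close to 1. This is consistent with the summary's statement that PI is reported as 1-PI to match stage progression, but the exact definition used in the paper should be taken from its Methods.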

Paper Structure

This paper contains 9 sections, 2 equations, 11 figures.

Figures (11)

  • Figure 1: Time evolution of the expression distribution. These panels show histograms of gene expression level at each timepoint. The data include genes both with and without highly variable expression. The black curves ('envelopes') are obtained from standard kernel density estimates (see Methods).
  • Figure 2: Statistical and information-theoretic properties of the gene ensemble expression as a function of time. (A) Mean expression value $\bar{\mu}$, (B) Shannon entropy $H(t)$, (C) KL distance $A_{KL}$, and (D) PI entropy $\Pi(t)$. Mean expression is computed on relative gene expression values with the $\log_2$ transformation reversed to show greater dynamic range, but all genes are included in the analysis.
  • Figure 3: Relative expression values for each gene indexed by its final expression value. The panels show the relative expression value of all genes as a function of time, where each gene is placed along the horizontal axis (indexed) according to its rank-ordered value measured during the 23-24 hr window. The distribution settles down in overall range and relative smoothness, as quantified by $\Pi(t)$. Original expression values were reported with a $\log_2$ transform, which has been undone here.
  • Figure 4: Relationship between time-development of ensemble properties and macroscopic or phenotypic properties. (A) Proportion of sampled embryos in each of the 17 phenotypic stages. The black line indicates $M(t)$, the mean stage across all embryos at a given time, normalized to [0,1]. (B) Mean stage (dark blue) compared to normalized Shannon entropy and PI entropy computed over all genes. PI is reported as 1-PI to match the stage progression. Note that this figure also uses timepoints at the $1/2$ hour marks. See text for details.
  • Figure 5: Relationships and mutual information between macroscopic phenotypic data and information-theoretic measures of gene expression data. These panels compare normalized mean embryonic stage $M(t)$ of the population against (A) Shannon entropy $H$, (B) Mean expression level $\bar{\mu}$, (C) Approach Kullback-Leibler distance $A_{KL}$, and (D) PI entropy $\Pi$.
  • ...and 6 more figures
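
The histogram-based measures named in the captions above, Shannon entropy $H(t)$ and the KL distance $A_{KL}$, can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's procedure: it assumes that expression values at each timepoint are binned on shared edges, that add-one smoothing is used to avoid empty bins, and that the "approach" distance is the Kullback-Leibler divergence from the distribution at time $t$ to the distribution at the final timepoint. The function names, the bin count, and the synthetic data are hypothetical.

```python
import numpy as np


def histogram_probs(values, edges, pseudocount=1.0):
    """Binned probability distribution with add-one smoothing (an assumption)."""
    counts, _ = np.histogram(values, bins=edges)
    probs = counts.astype(float) + pseudocount
    return probs / probs.sum()


def shannon_entropy(p):
    """Shannon entropy in bits of a strictly positive distribution."""
    return float(-(p * np.log2(p)).sum())


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return float((p * np.log2(p / q)).sum())


def ensemble_measures(expression, n_bins=50):
    """Return (H, A_KL) arrays over timepoints.

    expression: array of shape (n_genes, n_timepoints).
    A_KL[t] is taken here to be D(P_t || P_final), which is an assumed
    reading of the "approach" distance, not a confirmed definition.
    """
    expression = np.asarray(expression, dtype=float)
    # Shared bin edges so that distributions at different times are comparable.
    edges = np.linspace(expression.min(), expression.max(), n_bins + 1)
    dists = [
        histogram_probs(expression[:, t], edges)
        for t in range(expression.shape[1])
    ]
    final = dists[-1]
    entropy = np.array([shannon_entropy(p) for p in dists])
    approach = np.array([kl_divergence(p, final) for p in dists])
    return entropy, approach


# Synthetic stand-in for log-expression values (NOT real data).
rng = np.random.default_rng(0)
expression = np.column_stack(
    [rng.normal(0.0, 2.0, 1000), rng.normal(0.5, 1.5, 1000), rng.normal(1.0, 1.0, 1000)]
)
H, A_KL = ensemble_measures(expression, n_bins=20)
print(H, A_KL)
```

With this construction the distance at the final timepoint is zero by definition, so the curve measures how the ensemble distribution approaches its final form. The binning and smoothing choices affect the numerical values and would need to follow the paper's Methods for any real comparison.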