Cellular Development Follows the Path of Minimum Action

Rohola Zandie, Farhan Khodaee, Yufan Xia, Elazer R. Edelman

TL;DR

The paper addresses the challenge of quantitatively modeling cellular development as a stochastic yet rule-governed process. It introduces a least-action framework in which development follows trajectories that minimize an action inferred from data, implemented with an autoregressive Transformer that learns trajectory distributions. Key contributions include three interpretable metrics (entropy production, information-flow curvature, and local irreversibility) that reveal thermodynamic and informational constraints on cell fate decisions, validated on single-cell and embryonic data. This physics-inspired, data-driven approach provides a principled way to quantify developmental plasticity and constraints, with potential implications for understanding reprogramming and differentiation dynamics.

Abstract

Cellular development follows a stochastic yet rule-governed trajectory, though the underlying principles remain elusive. Here, we propose that cellular development follows paths of least action, aligning with foundational physical laws that govern dynamic systems across nature. We introduce a computational framework that takes advantage of the deep connection between the principle of least action and maximum entropy to model developmental processes using a Transformer architecture. This approach enables precise quantification of entropy production, information flow curvature, and local irreversibility for developmental asymmetry in single-cell RNA sequencing data. Within this unified framework, we provide interpretable metrics: entropy to capture exploration-exploitation trade-offs, curvature to assess plasticity-elasticity dynamics, and entropy production to characterize dedifferentiation and transdifferentiation. We validate our method across both single-cell and embryonic development datasets, demonstrating its ability to reveal hidden thermodynamic and informational constraints shaping cellular fate decisions.
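
The link between least action and maximum entropy mentioned in the abstract can be illustrated with a standard textbook argument. The sketch below is not taken from the paper and may differ from its exact formulation. The symbols $S[\gamma]$ (action of a trajectory $\gamma$), $\lambda$ (Lagrange multiplier), $Z$ (normalizer), and $\theta$ (model parameters) are assumptions introduced here for illustration.

```latex
% Maximize path entropy subject to normalization and a fixed mean action:
\max_{p}\; -\sum_{\gamma} p(\gamma)\,\log p(\gamma)
\quad \text{s.t.} \quad
\sum_{\gamma} p(\gamma) = 1, \qquad
\sum_{\gamma} p(\gamma)\, S[\gamma] = \bar{S}.

% The stationary solution is a Gibbs-like distribution over trajectories:
p(\gamma) = \frac{1}{Z}\, e^{-\lambda S[\gamma]},
\qquad
Z = \sum_{\gamma} e^{-\lambda S[\gamma]}.

% For \lambda > 0 the most probable trajectory is the one of least action:
\gamma^{*} = \arg\max_{\gamma} p(\gamma) = \arg\min_{\gamma} S[\gamma].

% An autoregressive model factorizes the trajectory probability, so one
% natural (assumed) identification of a learned action is the negative
% log-likelihood, summed over time steps:
p_{\theta}(x_{1:T}) = \prod_{t=1}^{T} p_{\theta}(x_t \mid x_{<t}),
\qquad
S_{\theta}[x_{1:T}] = -\sum_{t=1}^{T} \log p_{\theta}(x_t \mid x_{<t}).
```

Under this reading, the trajectories to which a trained autoregressive model assigns the highest probability are exactly those that minimize the learned action $S_{\theta}$.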

Paper Structure

This paper contains 17 sections, 27 equations, 6 figures.

Figures (6)

  • Figure 1: A. The Waddington landscape represented as a high-dimensional manifold ($M$). Different geodesic trajectories are associated with varying probabilities, indicated by the thickness of each trajectory. These trajectories can diverge (at $1$), merge (at $3$), or even locally reverse direction (at $2$). The tangent space $T_p(M)$ is illustrated for an arbitrary point $p$ along the geodesic trajectory $\gamma(t)$. B. Various curvature scenarios in a multipartite graph. In the leftmost panel, negative curvature represents a bottleneck in information flow between nodes $i$ and $j$, which is crucial for passing information between the layers (Bridge). The middle panel illustrates positive curvature, where multiple shortcut paths exist and $(i,j)$ is just one of them, indicating robustness in information flow (Hub). The rightmost panel depicts zero curvature, indicating a neutral information flow structure. C. The model architecture used in this study is an autoregressive Transformer composed of stacked decoder layers. The inputs at each time point include both cell expression data and cell type information, which are summed and used to predict the next cell expression state.
  • Figure 2: A. Cell reprogramming dynamics over time and across cell types, visualized using a force-directed layout algorithm. B. Accuracy decreases as coverage increases, but there exists an optimal balance where both accuracy and coverage remain high. C, D. Sample training results of the autoregressive model. The overall trend indicates that as training progresses and overfitting occurs, accuracy declines while coverage increases. Higher Top-$k$ values and temperature $T$ improve coverage but reduce accuracy.
  • Figure 3: Training of the autoregressive Transformer model for predicting cell development, shown on the evaluation set of the mouse embryonic fibroblast reprogramming dataset. As training progresses, accuracy decreases while coverage increases.
  • Figure 4: Normalized Entropy Across Different Time Steps. A. Normalized entropy of different cell types across time steps at fixed temperature ($T=0.1$): MET has the highest average entropy, while IPS has the lowest (empty boxes indicate missing data due to scarcity). B. Entropy fluctuations on the temperature scale $[0, 1]$. Higher temperatures smooth out differences in next-cell selection, while lower temperatures make these differences more pronounced. C. A zoomed-in view of normalized entropy in the $[0, 0.1]$ range, revealing finer details of fluctuations. Notably, two valleys appear at time steps 12 and 24.
  • Figure 5: Curvature Analysis of Cell Development in the Evaluation Set of Mouse Embryonic Reprogramming. A. The average curvature across time steps and cell types reveals distinct differences between both cell types and developmental stages. B. Cell development trajectories are color-coded based on Balanced Forman curvature. Initially, curvature remains mostly flat ($1$). In most trajectories, particularly during intermediate time steps ($2$), curvature becomes negative, indicating the oversquashing phenomenon, characteristic of Bridges. In contrast, regions with positive curvature ($3$) predominantly appear in Stromal and Neural cell types, suggesting excessive connectivity, characteristic of Hubs.
  • ...and 1 more figure
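
The captions for Figures 2 and 4 describe how temperature $T$ and Top-$k$ truncation shape the next-cell distribution and its normalized entropy. The following is a minimal sketch of that relationship, not the authors' implementation. The function names `next_state_distribution` and `normalized_entropy` are hypothetical, and normalizing by the logarithm of the number of candidates is an assumption about how "normalized entropy" is defined.

```python
import math


def next_state_distribution(logits, temperature=1.0, top_k=None):
    """Temperature-scaled softmax over candidate next states.

    Hypothetical helper: `logits` stands in for the model's unnormalized
    scores for each candidate next cell state.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be at least 1")

    scaled = [x / temperature for x in logits]

    # Optional Top-k truncation: mask every score below the k-th largest.
    # Ties at the cutoff are all kept, so more than k may survive.
    if top_k is not None and top_k < len(scaled):
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [s if s >= cutoff else -math.inf for s in scaled]

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def normalized_entropy(probs):
    """Shannon entropy divided by log(n), giving a value in [0, 1].

    Assumption: normalization by log of the number of candidates.
    """
    n = len(probs)
    if n < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(n)


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for illustration
    for temp in (0.1, 0.5, 1.0):
        probs = next_state_distribution(example_logits, temperature=temp)
        print(f"T={temp}: normalized entropy = {normalized_entropy(probs):.4f}")
```

In this sketch, lower temperatures concentrate probability on the top-scoring candidate and drive the normalized entropy toward 0, while higher temperatures flatten the distribution and push it toward 1. That matches the qualitative behavior the Figure 4 caption describes.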