Prediction of Cellular Identities from Trajectory and Cell Fate Information

Baiyang Dai, Jiamin Yang, Hari Shroff, Patrick La Riviere

TL;DR

This study addresses cell identity assignment during early C. elegans embryogenesis by predicting cell identities from a compact set of spatiotemporal features extracted from 4D time-lapse data. It compares Random Forest (RF), MLP, and LSTM/LSTMt models, which achieve over 91% accuracy, with RF and LSTMt surpassing $93\%$ on held-out embryos, and identifies division orientation relative to the mother cell (DM) as the most informative feature, alongside trajectory cues. The results demonstrate that simple, trajectory-based features can yield high-accuracy cell classification without full lineage tracking, enabling direct cell naming in imaging sequences and offering practical utility for labeled-cell studies in nematode neural and muscle cell research. Overall, the approach provides a lightweight alternative to traditional tracking, with potential applicability to similar embryonic systems in higher organisms.

Abstract

Determining cell identities in imaging sequences is an important yet challenging task. The conventional method for cell identification is cell tracking, which is complex and can be time-consuming. In this study, we propose a machine learning approach to cell identification during early $\textit{C. elegans}$ embryogenesis. Cell identification during $\textit{C. elegans}$ embryogenesis would provide insights into neural development, with implications for higher organisms including humans. We employed random forest, MLP, and LSTM models, and tested cell classification accuracy on 3D time-lapse confocal datasets spanning the first 4 hours of embryogenesis. By leveraging a small number of spatiotemporal features of individual cells, including cell trajectory and cell fate information, our models achieve an accuracy of over 91%, even with limited data. We also determine which features contribute most and interpret them in the context of biological knowledge. Our research demonstrates that cell identities in time-lapse imaging sequences can be predicted directly from simple spatiotemporal features.

Paper Structure (13 sections, 4 figures, 2 tables)

Figures (4)

  • Figure 1: An example of the original embryo image (left) and the image rotated to canonical orientation (right).
  • Figure 2: Trajectories (colored lines) of ABar cell in $28$ embryos.
  • Figure 3: Model architecture for MLP and LSTM. The dashed line depicts whether extra features, like start time 'SF', lifespan 'LF', and division orientations, are used in the model.
  • Figure 4: Top-$10$ features in random forest ranked by feature importance. The blue error bars show standard deviations of feature importance for each feature. Here $\mathrm{DM_{x}}$ denotes division orientation to mother cell 'DM' along X-axis, $\mathrm{Traj_{\,x,0}}$ denotes cell trajectory 'Traj' along X-axis at time $0$, $\mathrm{Traj_{\,y,1}}$ denotes 'Traj' along Y-axis at time $1$.
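
The feature names in the Figure 4 caption can be read as entries of a flat per-cell feature vector. The following is a minimal sketch of how such a vector might be assembled, not the authors' implementation. It assumes that 'DM' is the unit vector from the mother cell's last position to the daughter cell's first position, and that the trajectory is a list of (x, y, z) positions in the canonical orientation. The function names `division_orientation` and `build_features`, the parameter `n_steps`, and the key format are hypothetical.

```python
import math


def division_orientation(mother_pos, daughter_pos):
    """Unit vector from mother to daughter position.

    ASSUMPTION: this is one plausible reading of 'DM'; the paper may
    define division orientation differently.
    """
    delta = [d - m for d, m in zip(daughter_pos, mother_pos)]
    norm = math.sqrt(sum(c * c for c in delta))
    if norm == 0.0:
        # Degenerate case: identical positions, no defined direction.
        return [0.0, 0.0, 0.0]
    return [c / norm for c in delta]


def build_features(trajectory, start_time, lifespan, mother_pos, n_steps=2):
    """Flatten one cell's data into a {name: value} dict.

    trajectory: list of (x, y, z) tuples, one per time step (hypothetical layout).
    n_steps:    number of leading trajectory points to include (hypothetical).
    """
    if len(trajectory) < n_steps:
        raise ValueError("trajectory is shorter than n_steps")

    # Scalar features, named as in the Figure 3 caption.
    features = {"SF": float(start_time), "LF": float(lifespan)}

    # Division orientation relative to the mother cell, one entry per axis.
    dm = division_orientation(mother_pos, trajectory[0])
    for axis, value in zip("xyz", dm):
        features[f"DM_{axis}"] = value

    # Trajectory coordinates, e.g. "Traj_x,0" for the X position at time 0.
    for t in range(n_steps):
        for axis, value in zip("xyz", trajectory[t]):
            features[f"Traj_{axis},{t}"] = float(value)

    return features
```

A dict of this shape can be converted to a fixed-order list and passed to any tabular classifier; the key order would need to be kept identical across cells and embryos.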