Understanding Cell Fate Decisions with Temporal Attention

Florian Bürger, Martim Dias Gomes, Adrián E. Granada, Noémie Moreau, Katarzyna Bozek

Abstract

Understanding non-genetic determinants of cell fate is critical for developing and improving cancer therapies, as genetically identical cells can exhibit divergent outcomes under the same treatment conditions. In this work, we present a deep learning approach for cell fate prediction from raw long-term live-cell recordings of cancer cell populations under chemotherapeutic treatment. Our Transformer model is trained to predict cell fate directly from raw image sequences, without relying on predefined morphological or molecular features. Beyond classification, we introduce a comprehensive explainability framework for interpreting the temporal and morphological cues guiding the model's predictions. We demonstrate that cell outcomes can be predicted from the video alone: our model achieves a balanced accuracy of 0.94 and an F1-score of 0.93. Attention and masking experiments further indicate that the signal predictive of cell fate is not confined to the final frames of a cell trajectory, as reliable predictions are possible up to 10 h before the event. Our analysis reveals distinct temporal distributions of predictive information in the mitotic and apoptotic sequences, as well as the role of cell morphology and p53 signaling in determining cell outcomes. Together, these findings demonstrate that attention-based temporal models enable accurate cell fate prediction while providing biologically interpretable insights into non-genetic determinants of cellular decision-making. The code is available at https://github.com/bozeklab/Cell-Fate-Prediction.
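
For readers less familiar with the balanced accuracy metric reported in the abstract, the following is a minimal sketch of how it is commonly computed: as the unweighted mean of the per-class recalls. The function names and the toy labels are hypothetical and chosen for illustration only; they are not taken from the authors' repository and do not reflect the paper's data.

```python
def per_class_recall(y_true, y_pred, labels):
    """Recall per label: correctly predicted samples / ground-truth samples."""
    recalls = {}
    for label in labels:
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls[label] = hits / total if total else 0.0
    return recalls


def balanced_accuracy(y_true, y_pred, labels):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred, labels)
    return sum(recalls.values()) / len(labels)


def accuracy(y_true, y_pred):
    """Plain accuracy: fraction of samples predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy labels, invented for illustration (not the paper's data).
labels = ["mitosis", "apoptosis"]
y_true = ["mitosis"] * 8 + ["apoptosis"] * 2
y_pred = ["mitosis"] * 8 + ["apoptosis", "mitosis"]

plain = accuracy(y_true, y_pred)
balanced = balanced_accuracy(y_true, y_pred, labels)
print(f"accuracy={plain:.2f}, balanced accuracy={balanced:.2f}")
```

In this toy case one of the two apoptosis samples is missed, so plain accuracy stays high while balanced accuracy drops, which is why the balanced variant is often preferred when one class is rarer than the other.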

Paper Structure (20 sections, 2 equations, 6 figures)

Figures (6)

  • Figure 1: Overview of the proposed cell fate prediction framework and the explainability using temporal attention analysis. 1) From time-lapse microscopy recordings, we extract single-cell trajectories. Each trajectory starts at the first appearance of a cell and ends at the last frame prior to the fate event (division or death). 2) For each cell patch, frame-level embeddings are computed using a ResNet-50 backbone. We randomly mask a portion of frames at the end of each trajectory during training. 3) The sequence of frame embeddings is processed by a Transformer encoder, which integrates temporal information to predict the final cell fate. 4) To enable interpretability, we extract the attention weights assigned to individual frames and categorize them into high-attention and low-attention frames. 5) In parallel, we compute handcrafted cell features for each patch (e.g., area, eccentricity, p53 concentration). 6) Finally, we compare the feature distributions of high- and low-attention frames to identify biologically meaningful patterns associated with the model’s decision-making process.
  • Figure 2: Confusion matrix normalized over the ground-truth labels, with absolute counts shown in parentheses.
  • Figure 3: Temporal truncation analysis to assess the importance of late parts of the cell trajectory for its fate prediction. a) Balanced accuracy when progressively truncating sequences by removing an increasing number of frames from the end of each trajectory. b) Balanced accuracy when restricting the model input to only the last $k$ frames of each sequence. c) Class-wise recall under progressive truncation from the sequence end, reported separately for apoptosis and mitosis. d) Class-wise recall when predictions are based solely on the last $k$ frames, highlighting the contribution of late temporal information for each cell fate.
  • Figure 4: Aggregated attention weights across correctly classified sequences, aligned to the final frame and normalized by the global minimum and maximum attention values.
  • Figure 5: Effect size analysis based on Cliff’s delta for handcrafted cell features. For each feature, the median Cliff’s delta and its 95% confidence interval are shown separately for mitosis and apoptosis. Positive values indicate higher feature values in high-attention frames compared to low-attention frames, whereas negative values indicate the opposite. The background shading denotes commonly used effect size categories (negligible, small, medium, and large effects), facilitating interpretation of the magnitude and direction of the observed differences.
  • ...and 1 more figures
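
The Cliff's delta effect size mentioned in the Figure 5 caption can be sketched as follows. This is a minimal illustration of the standard definition (the fraction of cross-group pairs where the first value is larger, minus the fraction where it is smaller), not the authors' implementation. The function names, the sample values, and the magnitude thresholds (0.147, 0.33, 0.474, which are commonly cited in the literature) are assumptions; the source does not state which thresholds its shading uses.

```python
def cliffs_delta(high, low):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs x in high, y in low."""
    if not high or not low:
        raise ValueError("both groups must be non-empty")
    greater = sum(1 for x in high for y in low if x > y)
    less = sum(1 for x in high for y in low if x < y)
    return (greater - less) / (len(high) * len(low))


def magnitude(delta):
    """Map |delta| to a category using commonly cited thresholds (assumed here)."""
    size = abs(delta)
    if size < 0.147:
        return "negligible"
    if size < 0.33:
        return "small"
    if size < 0.474:
        return "medium"
    return "large"


# Hypothetical values of one handcrafted feature (e.g., cell area) in
# high-attention and low-attention frames; invented for illustration.
high_attention = [5.0, 6.0, 7.0]
low_attention = [1.0, 2.0, 6.0]

delta = cliffs_delta(high_attention, low_attention)
print(f"delta={delta:.3f} ({magnitude(delta)})")
```

A positive value means the feature tends to be higher in the first group, matching the sign convention in the caption. Because the statistic depends only on pairwise ordering, it makes no assumption about the shape of the feature distributions.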