CellStream: Dynamical Optimal Transport Informed Embeddings for Reconstructing Cellular Trajectories from Snapshots Data
Yue Ling, Peiqi Zhang, Zhenyi Zhang, Peijie Zhou
TL;DR
CellStream addresses the challenge of reconstructing continuous cellular trajectories from sparse, noisy time-series scRNA-seq snapshots by jointly learning a dynamics-informed embedding and latent cellular dynamics. It combines an autoencoder with unbalanced dynamical optimal transport (OT), optimizing a loss that includes an embedding-fidelity term and a Wasserstein–Fisher–Rao–based transport term in latent space. The latent dynamics are governed by a velocity field $\mathbf{v}$ and a growth term $g$ under the continuity equation $\partial_t q + \nabla_{\mathbf{z}}\cdot(\mathbf{v} q) = g q$, where $q$ is the density of cells over the latent coordinates $\mathbf{z}$. The framework demonstrates superior temporal coherence and noise robustness across simulated bifurcations and real datasets (EMT, iPSC) and extends to spatiotemporal contexts via spatial transcriptomics (MOSTA), outperforming state-of-the-art baselines. This end-to-end approach enables faithful reconstruction of dynamic cellular processes directly from static snapshots, with potential to inform regulatory inference and multi-omics integration.
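To make the continuity equation above concrete, the following is a minimal sketch of its particle form, not the authors' implementation. Each latent point moves with $d\mathbf{z}/dt = \mathbf{v}(\mathbf{z}, t)$ and carries a weight that evolves as $d\log w/dt = g(\mathbf{z}, t)$, so mass is transported and created or destroyed at the same time. The function name `simulate_unbalanced_flow`, the forward-Euler integrator, the weighting parameter `alpha`, and the specific energy $\int \sum_i w_i (\|\mathbf{v}\|^2 + \alpha g^2)\,dt$ are illustrative assumptions; in CellStream the velocity and growth terms are learned, whereas here they are plain callables.

```python
import numpy as np

def simulate_unbalanced_flow(z0, w0, velocity, growth,
                             t0=0.0, t1=1.0, n_steps=100, alpha=1.0):
    """Forward-Euler integration of weighted particles (illustrative sketch).

    Particles follow dz/dt = v(z, t) and d(log w)/dt = g(z, t), the particle
    form of  d_t q + div(v q) = g q.  Returns the final positions, the final
    weights, and a Wasserstein-Fisher-Rao-style energy estimate.
    `alpha` (growth penalty weight) is an assumed hyperparameter.
    """
    z = np.array(z0, dtype=float)               # (n, d) latent positions
    log_w = np.log(np.array(w0, dtype=float))   # (n,) log-weights (mass)
    dt = (t1 - t0) / n_steps
    energy = 0.0
    for k in range(n_steps):
        t = t0 + k * dt
        v = velocity(z, t)                      # (n, d) velocity at each particle
        g = growth(z, t)                        # (n,) growth rate at each particle
        w = np.exp(log_w)
        # Kinetic cost plus growth cost, weighted by the current mass.
        energy += dt * np.sum(w * (np.sum(v ** 2, axis=1) + alpha * g ** 2))
        z = z + dt * v                          # transport step
        log_w = log_w + dt * g                  # growth/decay step
    return z, np.exp(log_w), energy

# Toy usage: uniform drift with a constant growth rate (hypothetical fields).
z0 = np.zeros((4, 2))
w0 = np.full(4, 0.25)
z1, w1, wfr_energy = simulate_unbalanced_flow(
    z0, w0,
    velocity=lambda z, t: np.ones_like(z),
    growth=lambda z, t: np.full(len(z), 0.5),
)
```

With a constant growth rate the log-weight update is exact, so the total mass scales by $e^{g\,(t_1 - t_0)}$; setting `growth` to zero recovers balanced, mass-preserving transport.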
Abstract
Single-cell RNA sequencing (scRNA-seq), especially temporally resolved datasets, enables genome-wide profiling of gene expression dynamics at single-cell resolution across discrete time points. However, current technologies provide only sparse, static snapshots of cell states and are inherently affected by technical noise, which complicates the inference and representation of continuous transcriptional dynamics. Although embedding methods can reduce dimensionality and mitigate technical noise, most existing approaches treat trajectory inference separately from embedding construction and often neglect temporal structure. To address this challenge, we introduce CellStream, a novel deep learning framework that jointly learns an embedding and cellular dynamics from single-cell snapshot data by integrating an autoencoder with unbalanced dynamical optimal transport. Compared to existing methods, CellStream generates dynamics-informed embeddings that robustly capture temporal developmental processes while maintaining high consistency with the underlying data manifold. We demonstrate CellStream's effectiveness on both simulated datasets and real scRNA-seq data, including spatial transcriptomics. Our experiments indicate significant quantitative improvements over state-of-the-art methods in representing cellular trajectories, with enhanced temporal coherence and reduced noise sensitivity. Overall, CellStream provides a new tool for learning and representing continuous streams from noisy, static snapshots of single-cell gene expression.
