Trackastra: Transformer-based cell tracking for live-cell microscopy

Benjamin Gallusser, Martin Weigert

TL;DR

Trackastra is a transformer-based approach to cell tracking that learns pairwise associations directly within a short temporal window, addressing the challenges of dense, dividing cell populations without relying on hand-tuned linking costs. A key innovation is the parental softmax, which enforces biologically plausible parent–child relationships while permitting multi-child events, so that one-to-many associations (cell divisions) can be represented. The method achieves competitive or superior results on bacteria, nuclei, and particle datasets, and a multi-domain Trackastra model generalizes across domains, including out-of-domain HeLa data and Cell Tracking Challenge benchmarks. With simple per-object features and a scalable transformer architecture, Trackastra reduces dependence on costly optimization steps and shows promise for end-to-end detection–tracking integration and extension to 3D data.

Abstract

Cell tracking is a ubiquitous image analysis task in live-cell microscopy. Unlike multiple object tracking (MOT) for natural images, cell tracking typically involves hundreds of similar-looking objects that can divide in each frame, making it a particularly challenging problem. Current state-of-the-art approaches follow the tracking-by-detection paradigm, i.e., all cells are first detected per frame and then linked in a second step to form biologically consistent cell tracks. Linking is commonly solved via discrete optimization methods, which require manual tuning of hyperparameters for each dataset and are therefore cumbersome to use in practice. Here we propose Trackastra, a general-purpose cell tracking approach that uses a simple transformer architecture to directly learn pairwise associations of cells within a temporal window from annotated data. Importantly, unlike existing transformer-based MOT pipelines, our learning architecture also accounts for dividing objects such as cells and allows for accurate tracking even with simple greedy linking, thus making strides towards removing the requirement for a complex linking step. The proposed architecture operates on the full spatio-temporal context of detections within a time window while avoiding the computational burden of processing dense images. We show that our tracking approach performs on par with or better than highly tuned state-of-the-art cell tracking algorithms for various biological datasets, such as bacteria, cell cultures and fluorescent particles. We provide code at https://github.com/weigertlab/trackastra.
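
To make the idea of "simple greedy linking" concrete, the following is a minimal sketch of how predicted pairwise association scores could be pruned into tracks. It is not taken from the Trackastra code base: the function name `greedy_link`, the score threshold of 0.5, and the cap of two children per parent are all assumptions chosen for illustration.

```python
def greedy_link(edges, threshold=0.5, max_children=2):
    """Greedily select parent -> child links from scored candidate edges.

    edges: iterable of (parent_id, child_id, score) tuples.
    threshold, max_children: illustrative assumptions, not values from the paper.
    Returns a dict mapping each linked child to its chosen parent.
    """
    parent_of = {}    # child_id -> parent_id
    n_children = {}   # parent_id -> number of children assigned so far

    # Visit candidate edges from most to least confident.
    for parent, child, score in sorted(edges, key=lambda e: e[2], reverse=True):
        if score < threshold:
            break  # every remaining edge scores even lower
        if child in parent_of:
            continue  # each object keeps at most one parent
        if n_children.get(parent, 0) >= max_children:
            continue  # a parent may divide, but only into max_children objects
        parent_of[child] = parent
        n_children[parent] = n_children.get(parent, 0) + 1
    return parent_of


candidate_edges = [
    ("a", "b", 0.9), ("a", "c", 0.8), ("a", "d", 0.7),
    ("x", "b", 0.6), ("x", "d", 0.55), ("x", "e", 0.2),
]
links = greedy_link(candidate_edges)
```

In this toy input, object `a` is allowed to keep two children (a division), while its third candidate child falls through to the next best parent. No global optimization is involved, which is the point of contrast with discrete-optimization linkers.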

Paper Structure

This paper contains 25 sections, 10 equations, 4 figures, and 6 tables.

Figures (4)

  • Figure 1: Overview of Trackastra. Given frame-by-frame object detections in a live-cell video, object features are extracted from a small temporal window and passed as tokens into an encoder-decoder transformer to predict pairwise associations $\hat{A}$. We apply a parental softmax normalization on $\hat{A}$ to guide the learning directly towards biologically plausible associations. Finally, we build a candidate graph by averaging the predictions $\hat{A}$ over a sliding window, and obtain a tracking solution by pruning the graph with either a greedy algorithm or discrete optimization.
  • Figure 2: Cell tracking datasets evaluated in the experiments section. a) Bacteria dataset from [vanvliet2018], showing dense colonies of growing and dividing bacteria. b) DeepCell dataset of moving and dividing cells with labeled nuclei (DynamicNuclearNet from [schwartz2023]). c) Vesicle dataset from the ISBI particle tracking challenge [chenouard_isbi_2014], showing synthetically generated images of fluorescently labeled particles.
  • Figure 3: Error trees on a challenging Bacteria test video. Time runs along the vertical axis; edges are colored as true positive (green), false positive (magenta), or false negative (cyan).
  • Figure 4: Ablations on Bacteria with ground-truth detections, using only center points as features and a LAP linker. Lower is better. Results are shown for three runs per model.
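
The two post-processing steps named in the Figure 1 caption, parental softmax normalization and averaging predictions over a sliding window, can be sketched roughly as follows. This is one plausible reading rather than the authors' implementation: the function names, the restriction of candidate parents to the directly preceding frame, and the extra zero logit standing for "no parent" are all assumptions made for illustration.

```python
import numpy as np


def parental_softmax(logits, frames):
    """Normalize association logits per child over its candidate parents.

    logits[i, j]: raw score that object i is the parent of object j.
    frames[i]:    frame index of object i.
    Each column (child) sums to at most 1, so a child has at most one likely
    parent, while a row (parent) may sum to more than 1, which allows divisions.
    """
    logits = np.asarray(logits, dtype=float)
    frames = np.asarray(frames)
    # Assumption: i can be a parent of j only if i lies in the frame just before j.
    valid = (frames[:, None] + 1) == frames[None, :]
    masked = np.where(valid, logits, -np.inf)
    # Assumption: one extra zero logit per child represents "no parent".
    ext = np.vstack([masked, np.zeros((1, logits.shape[1]))])
    ext = ext - ext.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(ext)
    probs = weights / weights.sum(axis=0, keepdims=True)
    return probs[:-1]  # drop the "no parent" row


def average_over_windows(window_preds):
    """Average edge probabilities over all windows in which the edge appears.

    window_preds: list of dicts {(parent_id, child_id): probability},
    one dict per sliding-window position (hypothetical data layout).
    """
    sums, counts = {}, {}
    for pred in window_preds:
        for edge, prob in pred.items():
            sums[edge] = sums.get(edge, 0.0) + prob
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: sums[edge] / counts[edge] for edge in sums}


# Toy example: object 0 in frame 0 divides into objects 1 and 2 in frame 1.
toy_logits = np.zeros((3, 3))
toy_logits[0, 1] = 5.0
toy_logits[0, 2] = 5.0
toy_probs = parental_softmax(toy_logits, frames=[0, 1, 1])

toy_graph = average_over_windows([{(0, 1): 0.8}, {(0, 1): 0.6, (1, 2): 0.4}])
```

Normalizing per child rather than per parent is what lets a single parent receive high probability for two children at once. The averaged edge probabilities then form the weighted candidate graph that a greedy or optimization-based linker prunes.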