Single molecule localization microscopy challenge: a biologically inspired benchmark for long-sequence modeling

Fatemeh Valeh, Monika Farsang, Radu Grosu, Gerhard Schütz

Abstract

State space models (SSMs) have recently achieved strong performance on long-sequence modeling tasks while offering improved memory and computational efficiency compared to transformer-based architectures. However, their evaluation has been largely limited to synthetic benchmarks and application domains such as language and audio, leaving their behavior on sparse and stochastic temporal processes in biological imaging unexplored. In this work, we introduce the Single Molecule Localization Microscopy Challenge (SMLM-C), a benchmark dataset consisting of ten SMLM simulations spanning dSTORM and DNA-PAINT modalities with varying hyperparameters, designed to evaluate state space models on biologically realistic spatiotemporal point process data with known ground truth. Using a controlled subset of these simulations, we evaluate state space models and find that performance degrades substantially as temporal discontinuity increases, revealing fundamental challenges in modeling heavy-tailed blinking dynamics. These results highlight the need for sequence models better suited to the sparse, irregular temporal processes encountered in real-world scientific imaging data.

Paper Structure (32 sections, 1 equation, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Qualitative median-case comparison on dSTORM simulations. Ground-truth emitter positions (red $\circ$) and model predictions (blue $\times$) for representative median-difficulty samples from dSTORM with short off-time ($\mu_{\text{off}} = 100$, top row) and long off-time ($\mu_{\text{off}} = 1000$, bottom row). Colored background points show the raw observed localizations, where each color corresponds to a distinct emitter identity (ID). Columns correspond to S5 Small, S5 Large, Mamba-2 Small, and Mamba-2 Large. Annotations $\times k$ indicate $k$ predicted emitter locations lying within 20 nm of each other, and are placed for improved visibility.
  • Figure 2: Training and validation loss curves for S5 and Mamba-2 models on dSTORM-Sim2 and dSTORM-Sim4. Curves are averaged over three random seeds; shaded regions indicate one standard deviation. Models are selected based on minimum validation localization error (Hungarian error).
  • Figure 3: Qualitative reconstructions for representative easy test sequences (lowest 10% of averaged Chamfer loss). Predictions from S5 and Mamba-2 models are shown alongside raw input frames and ground-truth emitter locations.
  • Figure 4: Qualitative reconstructions for representative hard test sequences (highest 5% of averaged Chamfer loss). These examples highlight failure modes.
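
The captions above refer to a Chamfer loss and a Hungarian (matching-based) localization error, but this section does not give their definitions. The following is a minimal sketch of the two quantities as they are commonly defined for 2D point sets, not the paper's implementation. The function names `chamfer_distance` and `matched_error` are hypothetical, the mean-based normalization is an assumption, and the optimal one-to-one matching is found by brute-force enumeration (a stand-in for the Hungarian algorithm that is only practical for a handful of emitters and assumes equally sized sets).

```python
import math
from itertools import permutations


def chamfer_distance(pred, truth):
    """Symmetric Chamfer distance between two non-empty 2D point sets.

    Sums the mean nearest-neighbour distance from pred to truth and from
    truth to pred. Averaging (rather than summing) is an assumption here.
    """
    def one_way(a, b):
        return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

    return one_way(pred, truth) + one_way(truth, pred)


def matched_error(pred, truth):
    """Mean distance under the best one-to-one matching of pred to truth.

    Brute-force search over all permutations; it returns the same optimum
    the Hungarian algorithm would find, but scales factorially.
    """
    if len(pred) != len(truth):
        raise ValueError("this sketch only handles equally sized point sets")
    best_total = min(
        sum(math.dist(p, t) for p, t in zip(perm, truth))
        for perm in permutations(pred)
    )
    return best_total / len(truth)


# Illustrative coordinates in nm (made up, not taken from the benchmark).
truth = [(0.0, 0.0), (100.0, 0.0)]
pred = [(0.0, 3.0), (0.0, -4.0)]  # both predictions sit near the first emitter

print("Chamfer distance:", chamfer_distance(pred, truth))
print("Matched error:", matched_error(pred, truth))
```

In this toy case both predictions cluster on one emitter, the situation the $\times k$ annotations in Figure 1 mark. The one-to-one matching forces one prediction to be paired with the distant second emitter, whereas the prediction-to-truth half of the Chamfer distance stays small because each prediction only looks for its nearest ground-truth point.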