Cell Instance Segmentation: The Devil Is in the Boundaries

Peixian Liang, Yifan Ding, Yizhe Zhang, Jianxu Chen, Hao Zheng, Hongxiao Wang, Yejia Zhang, Guangyu Meng, Tim Weninger, Michael Niemier, X. Sharon Hu, Danny Z Chen

TL;DR

This work addresses cell instance segmentation by moving from pixel-wise clustering objectives to boundary-focused clustering. It introduces Ceb, a boundary-based framework in which a revised Watershed algorithm generates candidate boundaries and a boundary classifier labels each candidate from its boundary signature, which helps preserve geometric properties such as shape and curvature. A GI-matching training scheme is proposed to refine boundary labeling, and an optional temporal extension for videos (Ceb+Temporal) enforces temporal consistency. Across six public datasets, Ceb consistently outperforms foreground-pixel clustering baselines and is competitive with state-of-the-art methods, with temporal consistency providing additional gains in tracking tasks. The approach highlights the value of boundary geometry for accurate cell instance segmentation and shows practical benefits for biomedical image analysis and cell tracking.

Abstract

State-of-the-art (SOTA) methods for cell instance segmentation are based on deep learning (DL) semantic segmentation approaches, focusing on distinguishing foreground pixels from background pixels. In order to identify cell instances from foreground pixels (e.g., pixel clustering), most methods decompose instance information into pixel-wise objectives, such as distances to foreground-background boundaries (distance maps), heat gradients with the center point as heat source (heat diffusion maps), and distances from the center point to foreground-background boundaries with fixed angles (star-shaped polygons). However, pixel-wise objectives may lose significant geometric properties of the cell instances, such as shape, curvature, and convexity, which require a collection of pixels to represent. To address this challenge, we present a novel pixel clustering method, called Ceb (for Cell boundaries), to leverage cell boundary features and labels to divide foreground pixels into cell instances. Starting with probability maps generated from semantic segmentation, Ceb first extracts potential foreground-foreground boundaries with a revised Watershed algorithm. For each boundary candidate, a boundary feature representation (called boundary signature) is constructed by sampling pixels from the current foreground-foreground boundary as well as the neighboring background-foreground boundaries. Next, a boundary classifier is used to predict its binary boundary label based on the corresponding boundary signature. Finally, cell instances are obtained by dividing or merging neighboring regions based on the predicted boundary labels. Extensive experiments on six datasets demonstrate that Ceb outperforms existing pixel clustering methods on semantic segmentation probability maps. Moreover, Ceb achieves highly competitive performance compared to SOTA cell instance segmentation methods.
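
The final step of the abstract (merging neighboring regions across boundaries predicted as false) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: regions are assumed to be identified by integer ids, each candidate foreground-foreground boundary is assumed to be given as a pair of adjacent region ids, and the classifier output is assumed to be available as a list of booleans. The function name `merge_regions` and the use of union-find are choices made here for illustration.

```python
def merge_regions(num_regions, boundaries, is_true_boundary):
    """Merge regions separated by boundaries predicted as false.

    num_regions      -- number of regions, with ids 0 .. num_regions - 1 (assumed)
    boundaries       -- list of (region_a, region_b) pairs, one per candidate boundary
    is_true_boundary -- list of bools, the (hypothetical) classifier prediction
    Returns a list mapping each region id to an instance id.
    """
    parent = list(range(num_regions))

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), keep in zip(boundaries, is_true_boundary):
        if not keep:
            # A false boundary does not separate two cells: union its two sides.
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    # Relabel the union-find roots as consecutive instance ids.
    instance_of_root = {}
    labels = []
    for region in range(num_regions):
        root = find(region)
        labels.append(instance_of_root.setdefault(root, len(instance_of_root)))
    return labels


# Four regions in a chain; only the middle boundary is predicted as true.
print(merge_regions(4, [(0, 1), (1, 2), (2, 3)], [False, True, False]))
# prints [0, 0, 1, 1]
```

Union-find keeps the merge independent of the order in which boundaries are visited, which matters when several false boundaries chain more than two regions into one instance.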

Paper Structure

This paper contains 41 sections, 16 equations, 9 figures, 6 tables, 2 algorithms.

Figures (9)

  • Figure 1: An overview of our Ceb framework. An input image is fed to a CNN (e.g., U-Net) to produce a semantic probability map. Step (1), seed generation, derives seeds from the probability map. Step (2), boundary generation, uses these seeds and the probability map to produce possible boundaries and the regions they divide. Step (3), boundary label assignment, matches ground truth instance masks against the divided regions to obtain true regions and their corresponding boundaries (the true boundaries). Step (4), boundary signature extraction, generates a boundary-based feature representation (a boundary signature) for every possible boundary. During training, these boundary signatures, along with the true/false boundary labels obtained in Step (3), are fed to Step (5), boundary classification, to train a boundary classifier. During inference, the boundary classifier predicts a true/false label for each possible boundary from its boundary signature. The final instance results are obtained by merging connected regions divided by false boundaries.
  • Figure 2: The process of generating cell instance candidates from the possible regions and boundaries. Given the boundaries and regions (a), an undirected graph is built (b), in which the regions are represented as nodes and boundaries as edges. By enumerating all possible connected subgraphs, all instance candidates are obtained (c).
  • Figure 3: The optimal instance matching model for boundary label assignment. We consider instance-level matching between ground truth (GT) instance masks (a) and the generated boundaries and regions (e). The GT labels are decomposed into individual ground truth instances (b); possible cell instance candidates (c) are generated by considering all possible connected subgraphs (d) in the graph (e). Dashed lines indicate the matching results (two instances are selected). The boundaries inside the matched instances are assigned false labels, and those enclosing the matched instances are assigned true labels.
  • Figure 4: The process of extracting boundary signatures. Given extracted boundaries and divided regions (a), we produce region-region boundaries (c) and collect all possible boundaries into the boundary codebook (b), which includes both region-region boundaries and foreground-background boundaries. For each region-region boundary, we locate its two endpoints (d) and select a fork road around each endpoint, consisting of the boundary itself and its two neighboring boundaries from the boundary codebook (b). We then sample pixels from the two fork roads and transform them into a binary mask to obtain a boundary signature (two boundary signatures are shown in (e)).
  • Figure 5: An example illustrating our Ceb+Temporal method applied to three consecutive frames, $w-1$, $w$, and $w+1$. First, the Ceb method generates regions, boundaries, and the associated boundary probability scores (shown as numerical values). Next, high-confidence boundaries (e.g., with scores $\geq 0.9$) are selected, and the cell instances attached only to high-confidence boundaries are selected to form the initial state (marked in green at $\mathrm{iter}\ 0$). In the 1st iteration ($\mathrm{iter}\ 1$), the selected instances from frame $w+1$ are propagated to frame $w$, allowing two previously unselected instances in frame $w$ to be chosen. In the 2nd iteration ($\mathrm{iter}\ 2$), instances from frame $w$ are propagated to frame $w-1$, enabling the selection of two additional instances in that frame. All the selected instance candidates constitute the final instance segmentation results.
  • ...and 4 more figures
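
The candidate generation in Figure 2 and the labeling rule in Figure 3 can be sketched in a few lines. This is a hedged illustration rather than the paper's algorithm: it assumes regions are integer ids and boundaries are pairs of adjacent region ids, it enumerates connected subgraphs by brute force (exponential in the number of regions, so the hypothetical `max_size` parameter caps the candidate size), and it takes the matched instances as given instead of solving the matching model. The function names are made up for this example. The captions do not say how to label a boundary that touches no matched instance, so the sketch leaves such boundaries as `None`.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Return True if `nodes` induces a connected subgraph."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def instance_candidates(num_regions, boundaries, max_size=None):
    """Enumerate connected subsets of regions (Figure 2, brute force)."""
    adjacency = {r: set() for r in range(num_regions)}
    for a, b in boundaries:
        adjacency[a].add(b)
        adjacency[b].add(a)
    max_size = max_size or num_regions
    candidates = []
    for k in range(1, max_size + 1):
        for subset in combinations(range(num_regions), k):
            if is_connected(subset, adjacency):
                candidates.append(frozenset(subset))
    return candidates


def label_boundaries(boundaries, matched_instances):
    """Label boundaries from matched instances (rule stated in Figure 3)."""
    owner = {}
    for idx, instance in enumerate(matched_instances):
        for region in instance:
            owner[region] = idx
    labels = []
    for a, b in boundaries:
        owner_a, owner_b = owner.get(a), owner.get(b)
        if owner_a is None and owner_b is None:
            labels.append(None)   # not covered by the caption; left undecided
        elif owner_a == owner_b:
            labels.append(False)  # boundary lies inside one matched instance
        else:
            labels.append(True)   # boundary encloses a matched instance
    return labels


# Three regions in a chain: 0 - 1 - 2.
boundaries = [(0, 1), (1, 2)]
print(len(instance_candidates(3, boundaries)))  # prints 6 ({0, 2} is not connected)
print(label_boundaries(boundaries, [frozenset({0, 1}), frozenset({2})]))
# prints [False, True]
```

For larger graphs, limiting `max_size` or growing candidates outward from each region would avoid testing every subset.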