Cyto R-CNN and CytoNuke Dataset: Towards reliable whole-cell segmentation in bright-field histological images

Johannes Raufeisen, Kunpeng Xie, Fabian Hörst, Till Braunschweig, Jianning Li, Jens Kleesiek, Rainer Röhrig, Jan Egger, Bastian Leibe, Frank Hölzle, Alexander Hermans, Behrus Puladi

TL;DR

This work tackles reliable whole-cell segmentation in bright-field histology by introducing Cyto R-CNN, a two-branch architecture based on Mask R-CNN that jointly segments nuclei and cytoplasm. It is trained and evaluated on CytoNuke, a public dataset of hematoxylin-eosin (HE) stained head and neck squamous cell carcinoma (HNSCC) images with nucleus and cytoplasm annotations, which enables fair comparisons with QuPath, StarDist, and Cellpose. Cyto R-CNN achieves the highest whole-cell AP50 (58.65%) and AP75 (11.56%) and shows the best overall agreement with manual cell measurements (average KS distance $\bar{D}=0.15$), outperforming all three baselines. The released dataset and the method could improve digital pathology workflows by providing more reliable morphometric measurements, and the authors suggest extending the approach to more cell types and stains in future work.
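AP50 and AP75 conventionally denote average precision at intersection-over-union (IoU) thresholds of 0.5 and 0.75. The snippet below is a minimal sketch of only the IoU thresholding step, assuming masks are given as sets of pixel coordinates; the helper names (`mask_iou`, `rect_mask`) and the toy masks are hypothetical, and the paper's actual evaluation code is not shown in this summary. A full AP computation would additionally rank predictions by confidence and integrate the precision-recall curve.

```python
def mask_iou(mask_a, mask_b):
    """IoU of two masks given as sets of (row, col) pixel coordinates."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


def rect_mask(r0, c0, r1, c1):
    """Hypothetical helper: a filled rectangle as a set of pixels (end-exclusive)."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Toy example: a 10x10 ground-truth cell and a prediction shifted by 2 columns.
truth = rect_mask(0, 0, 10, 10)
pred = rect_mask(0, 2, 10, 12)

iou = mask_iou(truth, pred)   # 80 shared pixels / 120 pixels in the union
counts_at_50 = iou >= 0.50    # a match under the AP50 criterion
counts_at_75 = iou >= 0.75    # not a match under the stricter AP75 criterion
```

A small boundary offset is enough to pass the 0.5 threshold but fail the 0.75 one, which is one plausible reason AP75 values are much lower than AP50 values for all methods compared here.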

Abstract

Background: Cell segmentation in bright-field histological slides is a crucial topic in medical image analysis. Access to accurate segmentation allows researchers to examine the relationship between cellular morphology and clinical observations. Unfortunately, most segmentation methods known today are limited to nuclei and cannot segment the cytoplasm.

Material & Methods: We present a new network architecture, Cyto R-CNN, that is able to accurately segment whole cells (with both the nucleus and the cytoplasm) in bright-field images. We also present a new dataset, CytoNuke, consisting of several thousand manual annotations of head and neck squamous cell carcinoma cells. Using this dataset, we compared the performance of Cyto R-CNN to other popular cell segmentation algorithms, including QuPath's built-in algorithm, StarDist and Cellpose. To evaluate segmentation performance, we calculated AP50 and AP75 and measured 17 morphological and staining-related features for all detected cells. We compared these measurements to the gold standard of manual segmentation using the Kolmogorov-Smirnov test.

Results: Cyto R-CNN achieved an AP50 of 58.65% and an AP75 of 11.56% in whole-cell segmentation, outperforming all other methods (AP50/AP75: QuPath $19.46/0.91\%$; StarDist $45.33/2.32\%$; Cellpose $31.85/5.61\%$). Cell features derived from Cyto R-CNN showed the best agreement with the gold standard ($\bar{D} = 0.15$), outperforming QuPath ($\bar{D} = 0.22$), StarDist ($\bar{D} = 0.25$) and Cellpose ($\bar{D} = 0.23$).

Conclusion: Our newly proposed Cyto R-CNN architecture outperforms current algorithms in whole-cell segmentation while providing more reliable cell measurements than any other model. This could improve digital pathology workflows, potentially leading to improved diagnosis. Moreover, our published dataset can be used to develop further models in the future.
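To make the agreement score $\bar{D}$ more concrete, here is a minimal sketch of a two-sample Kolmogorov-Smirnov distance averaged over features. It assumes that $\bar{D}$ is the plain mean of the per-feature KS statistic, as the "average KS distance" wording suggests; the function names, feature names and values are hypothetical and not taken from the paper.

```python
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # Empirical CDFs only change at observed values, so checking those suffices.
    for x in set(a_sorted) | set(b_sorted):
        f_a = bisect_right(a_sorted, x) / len(a_sorted)
        f_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(f_a - f_b))
    return d


def mean_ks_distance(manual, predicted):
    """Average KS distance over the features present in both dictionaries."""
    shared = sorted(set(manual) & set(predicted))
    if not shared:
        raise ValueError("no shared features to compare")
    return sum(ks_distance(manual[f], predicted[f]) for f in shared) / len(shared)


# Hypothetical measurements for two features (values are illustrative only).
manual = {"area": [100, 120, 140, 160], "circularity": [0.7, 0.8, 0.9]}
predicted = {"area": [100, 120, 140, 160], "circularity": [0.1, 0.2, 0.3]}

d_bar = mean_ks_distance(manual, predicted)
```

In this toy case the area distributions coincide (distance 0) and the circularity distributions do not overlap at all (distance 1), so the average is 0.5. Lower values indicate that the automatically derived feature distributions are closer to the manual ones.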

Paper Structure (19 sections, 5 figures, 3 tables)

Figures (5)

  • Figure 1: A sample image from the CytoNuke dataset. Tumor nuclei annotations are shown in yellow, tumor cell annotations are shown in blue. Not every nucleus annotation has a corresponding cell annotation, since cell boundaries are not always clearly distinguishable.
  • Figure 2: Architectural overview of Cyto R-CNN. The backbone and RPN are trained on nuclei only. The nuclei proposals are then forwarded to two different branches. The first branch will perform a regular bounding box and mask regression for the nucleus. The second branch will scale the nucleus proposal and perform mask regression for the whole cell, including the cytoplasm. Both branches are then combined to generate instance segmentations for cell and nucleus at the same time.
  • Figure 3: Measurements of morphological whole-cell features obtained with different segmentation methods. The measurements were obtained by converting segmentation masks into GeoJSON files, importing them into QuPath, and using its built-in functionality to calculate shape and staining features.
  • Figure 4: Measurements of whole-cell staining features obtained with different segmentation methods. The measurements were obtained by converting segmentation masks into GeoJSON files, importing them into QuPath, and using its built-in functionality to calculate shape and staining features.
  • Figure 5: Sample predictions on hematoxylin-eosin stained images of head and neck squamous cell carcinoma. All images are part of the test dataset. Note: Only tumor cells were annotated. Other cells (lymphocytes, macrophages, fibrocytes, etc.) were not included.
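The Figure 2 caption states that the second branch scales the nucleus proposal before predicting the whole-cell mask. One simple way to picture this step is to enlarge a bounding box about its center, as in the sketch below. The function name `scale_box`, the scale factor of 2.0 and the clipping behavior are assumptions made for illustration; the caption does not say how the scaling is actually implemented.

```python
def scale_box(box, factor, image_size=None):
    """Scale an (x0, y0, x1, y1) box about its center by `factor`.

    If `image_size` = (width, height) is given, the result is clipped to the image.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * factor / 2
    half_h = (y1 - y0) * factor / 2
    scaled = [cx - half_w, cy - half_h, cx + half_w, cy + half_h]
    if image_size is not None:
        width, height = image_size
        scaled = [
            max(0.0, scaled[0]),
            max(0.0, scaled[1]),
            min(float(width), scaled[2]),
            min(float(height), scaled[3]),
        ]
    return tuple(scaled)


# Hypothetical nucleus proposal, enlarged so the box can also cover cytoplasm.
nucleus_box = (40, 40, 60, 60)
cell_box = scale_box(nucleus_box, factor=2.0)

# A proposal at the image corner is clipped to the image bounds.
corner_box = scale_box((0, 0, 20, 20), factor=2.0, image_size=(100, 100))
```

Starting from the nucleus proposal keeps the region proposal stage trained on nuclei only, which matches the caption's description, while the enlarged box gives the whole-cell mask head room to include the surrounding cytoplasm.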