Assessment of Cell Nuclei AI Foundation Models in Kidney Pathology

Junlin Guo, Siqi Lu, Can Cui, Ruining Deng, Tianyuan Yao, Zhewen Tao, Yizhe Lin, Marilyn Lionts, Quan Liu, Juming Xiong, Yu Wang, Shilin Zhao, Catie Chang, Mitchell Wilkes, Mengmeng Yin, Haichun Yang, Yuankai Huo

TL;DR

The study addresses the generalization of cell nuclei foundation models to kidney pathology by conducting a large-scale evaluation of three SOTA models (Cellpose, StarDist, CellViT) on 2,542 kidney WSIs. It employs a rating-based curation scheme and cross-model agreement analysis to examine prediction distributions and identify consensus failure patches, revealing that CellViT achieves the highest rate of good segmentations but substantial gaps remain. The findings highlight specific failure modes and demonstrate how ensemble and consensus approaches can guide the development of kidney-domain-specific foundation models with reduced annotation needs. Overall, the work provides a rigorous benchmark and practical insights for improving nuclei segmentation in diverse kidney tissues.

Abstract

Cell nuclei instance segmentation is a crucial task in digital kidney pathology. Traditional automatic segmentation methods often lack generalizability when applied to unseen datasets. Recently, the success of foundation models (FMs) has provided a more generalizable solution, potentially enabling the segmentation of any cell type. In this study, we perform a large-scale evaluation of three widely used state-of-the-art (SOTA) cell nuclei foundation models (Cellpose, StarDist, and CellViT). Specifically, we created a highly diverse evaluation dataset consisting of 2,542 kidney whole slide images (WSIs) collected from both human and rodent sources, encompassing various tissue types, sizes, and staining methods. To our knowledge, this is the largest-scale evaluation of its kind to date. Our quantitative analysis of the prediction distribution reveals a persistent performance gap in kidney pathology. Among the evaluated models, CellViT demonstrated superior performance in segmenting nuclei in kidney pathology. However, none of the foundation models are perfect; a performance gap remains in general nuclei segmentation for kidney pathology.

Paper Structure (14 sections, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Overall framework. (1) Creation of a diverse, large-scale kidney nuclei dataset. (2) Nuclei instance segmentation with Cellpose, StarDist, and CellViT. (3) Rating of segmentation performance. Meaningful images were then identified through model agreement analysis.
  • Figure 2: The distribution of cell nuclei foundation model predictions across our dataset. Each row provides the number of "good", "medium", and "bad" predictions made by the foundation model.
  • Figure 3: (a) shows the agreement percentages between each pair of foundation models used in this study. To further assess the cross-model performance, (b) shows the percentages of image patches where all three models agree, two models agree, or no models agree, for each prediction class ("good", "medium", "bad"). Finally, the right panel presents examples of consensus "good" and consensus "failure" image patches, along with their respective failure categories.
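
The agreement analysis summarized in Figure 3 can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes each image patch has already received one rating ("good", "medium", or "bad") per model, and the rating values, the function names (`pairwise_agreement`, `consensus_level`, `summarize_consensus`), and the rule for defining a consensus failure (all three models rated "bad") are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-patch ratings, one list per model (made-up values,
# not results from the paper). Index i refers to the same patch in each list.
ratings = {
    "Cellpose": ["good", "bad", "medium", "bad", "good"],
    "StarDist": ["good", "bad", "bad", "medium", "medium"],
    "CellViT": ["good", "bad", "good", "good", "good"],
}


def pairwise_agreement(ratings):
    """Fraction of patches on which each pair of models has the same rating."""
    result = {}
    for a, b in combinations(ratings, 2):
        same = sum(x == y for x, y in zip(ratings[a], ratings[b]))
        result[(a, b)] = same / len(ratings[a])
    return result


def consensus_level(patch_ratings):
    """Classify one patch as 'all', 'two', or 'none' agreement.

    Returns (level, label), where label is the shared rating,
    or None when no two models agree.
    """
    label, count = Counter(patch_ratings).most_common(1)[0]
    if count == len(patch_ratings):
        return "all", label
    if count >= 2:
        return "two", label
    return "none", None


def summarize_consensus(ratings):
    """Count patches per (agreement level, rating class) combination."""
    per_patch = zip(*ratings.values())
    return Counter(consensus_level(p) for p in per_patch)


if __name__ == "__main__":
    for pair, frac in pairwise_agreement(ratings).items():
        print(pair, f"{frac:.0%}")
    print(summarize_consensus(ratings))
    # Assumed definition: a consensus failure is a patch every model rated "bad".
    failures = [
        i for i, p in enumerate(zip(*ratings.values()))
        if consensus_level(p) == ("all", "bad")
    ]
    print("consensus failure patches:", failures)
```

Patches flagged as consensus failures under this assumed rule are the natural candidates for targeted annotation, since no existing model handles them well.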