A Lightweight and Extensible Cell Segmentation and Classification Model for Whole Slide Images

Nikita Shvetsov, Thomas K. Kilvaer, Masoud Tafavvoghi, Anders Sildnes, Kajsa Møllersen, Lill-Tove Rasmussen Busund, Lars Ailo Bongo

TL;DR

The paper addresses bottlenecks in cell-level analysis for digital pathology in four steps: refining PanNuke and MoNuSAC annotations via cross-relabeling, using the H-Optimus foundation model as a fixed encoder for joint segmentation and classification, distilling the resulting model into a lightweight one, and integrating that compact model into the QuPath platform to fit existing pathology workflows. Compared with a CNN baseline, the H-Optimus-based model improves average $R^2$ from 0.575 to 0.871 and average $PQ$ from 0.450 to 0.492. The distilled model maintains comparable performance with 48× fewer parameters. The work emphasizes deployment within existing pathology tools, while acknowledging the need for extensive external validation and future enhancements such as multi-modal data and deeper workflow integration.

Abstract

Developing clinically useful cell-level analysis tools in digital pathology remains challenging due to limitations in dataset granularity, inconsistent annotations, high computational demands, and difficulties integrating new technologies into workflows. To address these issues, we propose a solution that enhances data quality, model performance, and usability by creating a lightweight, extensible cell segmentation and classification model. First, we update data labels through cross-relabeling to refine annotations of PanNuke and MoNuSAC, producing a unified dataset with seven distinct cell types. Second, we leverage the H-Optimus foundation model as a fixed encoder to improve feature representation for simultaneous segmentation and classification tasks. Third, to address foundation models' computational demands, we distill knowledge to reduce model size and complexity while maintaining comparable performance. Finally, we integrate the distilled model into QuPath, a widely used open-source digital pathology platform. Results demonstrate improved segmentation and classification performance using the H-Optimus-based model compared to a CNN-based model. Specifically, average $R^2$ improved from 0.575 to 0.871, and average $PQ$ score improved from 0.450 to 0.492, indicating better alignment with actual cell counts and enhanced segmentation quality. The distilled model maintains comparable performance while reducing parameter count by a factor of 48. By reducing computational complexity and integrating into workflows, this approach may significantly impact diagnostics, reduce pathologist workload, and improve outcomes. Although the method shows promise, extensive validation is necessary prior to clinical deployment.
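The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions: $R^2$ as the coefficient of determination between actual and predicted cell counts, and $PQ$ (panoptic quality) as the sum of IoUs over matched instances divided by $TP + 0.5\,FP + 0.5\,FN$, with a match requiring IoU above 0.5. The function names, the input format, and the example numbers are hypothetical, and the paper's exact evaluation protocol (for example per-class averaging or a regression-based $R^2$) may differ.

```python
def r_squared(actual, predicted):
    """Coefficient of determination between actual and predicted cell counts."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def panoptic_quality(pair_ious, n_pred, n_true, iou_threshold=0.5):
    """PQ = (sum of IoUs over true positives) / (TP + 0.5*FP + 0.5*FN).

    pair_ious: IoU of each candidate (prediction, ground truth) pair,
               at most one pair per instance (hypothetical input format).
    n_pred:    total number of predicted instances.
    n_true:    total number of ground-truth instances.
    """
    # A pair counts as a true positive only if its IoU exceeds the threshold.
    tp_ious = [iou for iou in pair_ious if iou > iou_threshold]
    tp = len(tp_ious)
    fp = n_pred - tp  # predictions left without a valid match
    fn = n_true - tp  # ground-truth instances left without a valid match
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(tp_ious) / denom if denom > 0 else 0.0


# Illustrative numbers only, not taken from the paper.
counts_r2 = r_squared([10, 20, 30], [12, 18, 33])
patch_pq = panoptic_quality([0.9, 0.8, 0.4], n_pred=4, n_true=3)
```

Under these definitions, $R^2$ measures how well predicted counts track actual counts, while $PQ$ penalizes both poorly overlapping masks and missed or spurious instances, which is why the abstract reads the two improvements as better count agreement and better segmentation quality respectively.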

Paper Structure

This paper contains 16 sections, 4 equations, 11 figures, 5 tables.

Figures (11)

  • Figure 1: Overview of the proposed solution, including (1) data refinement using cross-relabeling, (2) teacher model development and fine-tuning, (3) student model optimization with knowledge distillation, and (4) integration of the student model into QuPath
  • Figure 2: Refined dataset generation via cross-relabeling
  • Figure 3: Cell instance preprocessing, including (1) cell map extraction, (2) bounding box delineation, (3) adjustment of cell boxes, and (4) cropping and resizing of cell images
  • Figure 4: Cell relabeling procedure for epithelial and inflammatory cell classes
  • Figure 5: UNETR-like model with a foundation model as backbone
  • ...and 6 more figures