Semi-supervised Segmentation of Histopathology Images with Noise-Aware Topological Consistency

Meilong Xu, Xiaoling Hu, Saumya Gupta, Shahira Abousamra, Chao Chen

TL;DR

TopoSemiSeg tackles semi-supervised histopathology segmentation where dense gland and nuclei arrangements lead to topological errors. It introduces a noise-aware topology loss within a teacher–student framework, decomposing persistence diagrams into signal and noise to enforce signal topology consistency via Wasserstein distance while removing noisy topology with a total-persistence loss. The method yields superior topology-wise metrics across CRAG, GlaS, and MoNuSeg under 10% and 20% labeled data, with competitive pixel-wise performance, and demonstrates robustness across hyper-parameters and backbones. By learning robust topological representations from unlabeled data, it enables more reliable morphological analysis in digital pathology and can integrate with existing semi-supervised frameworks. The approach highlights the practical value of incorporating differentiable topological constraints into semi-supervised segmentation for densely packed histopathology structures.
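
The decomposition and the two topological terms described above can be illustrated with a minimal sketch. It assumes, hypothetically, that a persistence diagram is a list of `(birth, death)` pairs, that signal and noise are separated by a simple persistence threshold, and that matching uses squared Euclidean cost; the helper names (`decompose`, `wasserstein_sq`, `total_persistence`) are illustrative and are not taken from the authors' code. The matching is a brute-force search that is only practical for a handful of dots.

```python
from itertools import permutations


def persistence(point):
    """Lifetime |death - birth| of one (birth, death) dot."""
    birth, death = point
    return abs(death - birth)


def decompose(diagram, threshold):
    """Split a diagram into (signal, noise) by persistence (assumed rule)."""
    signal = [p for p in diagram if persistence(p) >= threshold]
    noise = [p for p in diagram if persistence(p) < threshold]
    return signal, noise


def total_persistence(diagram, degree=2):
    """Sum of persistence**degree over all dots in the diagram."""
    return sum(persistence(p) ** degree for p in diagram)


def _diagonal_cost(point):
    """Squared Euclidean distance from a dot to its diagonal projection."""
    return persistence(point) ** 2 / 2.0


def wasserstein_sq(diagram_a, diagram_b):
    """Squared 2-Wasserstein distance by brute force (tiny diagrams only)."""
    na, nb = len(diagram_a), len(diagram_b)
    # Pad each side with None slots that stand for the diagonal, so any
    # dot may stay unmatched at the cost of moving it onto the diagonal.
    side_a = list(diagram_a) + [None] * nb
    side_b = list(diagram_b) + [None] * na
    best = float("inf")
    for perm in permutations(range(na + nb)):
        cost = 0.0
        for i, j in enumerate(perm):
            p, q = side_a[i], side_b[j]
            if p is None and q is None:
                continue  # diagonal matched to diagonal costs nothing
            if q is None:
                cost += _diagonal_cost(p)
            elif p is None:
                cost += _diagonal_cost(q)
            else:
                cost += (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
        best = min(best, cost)
    return best


# Toy diagrams (made-up numbers): two persistent dots and one noisy dot each.
student = [(0.9, 0.1), (0.8, 0.2), (0.55, 0.5)]
teacher = [(0.95, 0.05), (0.85, 0.15), (0.6, 0.58)]

student_signal, student_noise = decompose(student, threshold=0.3)
teacher_signal, _ = decompose(teacher, threshold=0.3)

# Consistency term: pull the student's signal dots toward the teacher's.
consistency = wasserstein_sq(student_signal, teacher_signal)
# Removal term: shrink the noisy dots (applied to the student here, an assumption).
removal = total_persistence(student_noise)
print(consistency, removal)
```

In a training loop these quantities would have to be computed on differentiable likelihood maps; this sketch only shows the bookkeeping on fixed diagrams. For larger diagrams an assignment solver would replace the permutation search.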

Abstract

In digital pathology, segmenting densely distributed objects like glands and nuclei is crucial for downstream analysis. Since detailed pixel-wise annotation is very time-consuming, we need semi-supervised segmentation methods that can learn from unlabeled images. Existing semi-supervised methods are often prone to topological errors, e.g., missing or incorrectly merged/separated glands or nuclei. To address this issue, we propose TopoSemiSeg, the first semi-supervised method that learns the topological representation from unlabeled histopathology images. The major challenge is that, for unlabeled images, we only have predictions carrying noisy topology. To this end, we introduce a noise-aware topological consistency loss to align the representations of a teacher and a student model. By decomposing the topology of the prediction into signal topology and noisy topology, we ensure that the models learn the true topological signals and become robust to noise. Extensive experiments on public histopathology image datasets show the superiority of our method, especially on topology-aware evaluation metrics. Code is available at https://github.com/Melon-Xu/TopoSemiSeg.

Paper Structure

This paper contains 22 sections, 11 equations, 8 figures, 11 tables.

Figures (8)

  • Figure 1: Illustration of the significance of topological correctness in gland segmentation. (a) An input image. (b) The ground truth (GT). (c) The result of a SoTA semi-supervised segmentation method [zhou2023xnet] that lacks any topological regularization. (d) Our segmentation result. For the regions within boxes, the SoTA result has errors that, while minor at the pixel level, significantly alter the semantic interpretation. The red boxes indicate prediction errors such as incorrectly merged adjacent glands, the blue box indicates false-positive gland predictions, and the green boxes indicate false-negative holes in glands. These errors affect the pathologist's decisions and analysis.
  • Figure 2: (a) A predicted likelihood map $f$, (b) the binary prediction, and (c) the corresponding persistence diagram $Dgm(f)$, which tends to be noisy. In (d), consider the filtration for different values of threshold $c$. Notice that there are three true, or signal, structures, denoted by colors red, green, and blue, which persist across the range of $c$. Hence the dots corresponding to these structures are located at the upper-left corner of $Dgm(f)$. The remaining colors denote several noisy structures which persist for a short range of $c$, and thus their dots appear closer to the diagonal. Note that we only show 0-dim persistent dots referring to connected components in $Dgm(f)$.
  • Figure 3: An overview of our method. (a) denotes the labeled workflow. The student model learns from labeled images via the supervised loss $\mathcal{L}^{S}$. (b) denotes the unlabeled workflow. The student model learns from unlabeled images using $\mathcal{L}^{U}$, which consists of pixel-wise consistency loss $\mathcal{L}^{U}_{\text{pixel}}$ and noise-aware topological consistency loss $\mathcal{L}^{U}_{\text{topo}}$. (c) shows the details of our proposed noise-aware topological consistency loss $\mathcal{L}^{U}_{\text{topo}}$, which encompasses our decomposition and optimal matching strategy, resulting in signal topology consistency loss $\mathcal{L}^{U}_{\text{topo-cons}}$ and noisy topology removal loss $\mathcal{L}^{U}_{\text{topo-rem}}$.
  • Figure 4: Intuition of our decomposition and matching strategy. (a) the raw image. (b) the ground truth, included for reference. (c) the student likelihood (lh). (d) the teacher likelihood. (e) decomposition of the persistence diagram of the student likelihood. The purple line demonstrates the decomposition. (f) decomposition of the persistence diagram (tPD) of teacher likelihood. (g) the consistency between the signal topology. Green arrows show the matching process. (h) the noisy topology removal process. (i) the matching process without decomposition.
  • Figure 5: Qualitative results on three histopathology image datasets using $20\%$ labeled data for training. Locations prone to topological errors are shown within red boxes. Row 1: CRAG, Row 2: GlaS, Rows 3 & 4: MoNuSeg. Zoom in for better views.
  • ...and 3 more figures
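
The filtration idea in the Figure 2 caption (components that persist over a wide range of the threshold $c$ are signal, short-lived ones are noise) can be made concrete with a small sketch. It computes 0-dimensional persistence pairs of a 2D likelihood grid with a union-find structure, assuming a superlevel-set filtration (pixels enter as $c$ decreases) and 4-connectivity. The function name `zero_dim_persistence` is hypothetical, and the paper's own filtration convention and implementation may differ.

```python
def zero_dim_persistence(grid):
    """0-dim persistence pairs (birth, death) of a 2D grid of likelihoods.

    Pixels are added from the highest value to the lowest. A component is
    born when a pixel has no previously added neighbor and dies when it
    merges into an older component (the elder rule).
    """
    rows, cols = len(grid), len(grid[0])
    order = sorted(
        ((grid[r][c], r, c) for r in range(rows) for c in range(cols)),
        reverse=True,
    )
    parent = {}  # union-find forest over the pixels added so far
    birth = {}   # birth value, read only at component roots
    pairs = []

    def find(cell):
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving
            cell = parent[cell]
        return cell

    for value, r, c in order:
        cell = (r, c)
        parent[cell] = cell
        birth[cell] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbor = (r + dr, c + dc)
            if neighbor not in parent:
                continue  # outside the grid or not yet added
            older, younger = find(cell), find(neighbor)
            if older == younger:
                continue
            if birth[older] < birth[younger]:
                older, younger = younger, older
            # The younger component dies now; skip zero-persistence pairs.
            if birth[younger] > value:
                pairs.append((birth[younger], value))
            parent[younger] = older

    # The oldest component never merges; close it at the global minimum.
    root = find(order[0][1:])
    pairs.append((birth[root], order[-1][0]))
    return sorted(pairs, key=lambda p: p[0] - p[1], reverse=True)


# Toy likelihood map: two strong peaks (0.9, 0.8) and one weak bump (0.3).
likelihood = [
    [0.9, 0.2, 0.8],
    [0.1, 0.1, 0.1],
    [0.1, 0.3, 0.1],
]
for birth_value, death_value in zero_dim_persistence(likelihood):
    print(birth_value, death_value, birth_value - death_value)
```

In the toy map, the two strong peaks yield long-lived pairs and the weak bump a short-lived one, which mirrors the signal versus noise distinction in the caption.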

Theorems & Definitions (2)

  • Definition: Wasserstein distance between PDs [cohen2010lipschitz]
  • Definition: Total Persistence [cohen2010lipschitz]
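
For orientation, the standard textbook forms of these two quantities are sketched below. The symbols $\gamma$, $p$, $b_x$, and $d_x$ are notation introduced here for illustration, and the exact norm and exponent used in the paper's own definitions may differ.

```latex
% p-th Wasserstein distance between two persistence diagrams, where
% \gamma ranges over bijections between the diagrams (each diagram is
% augmented with the points of the diagonal):
W_p\bigl(Dgm(f), Dgm(g)\bigr)
  = \Bigl( \inf_{\gamma} \sum_{x \in Dgm(f)}
      \lVert x - \gamma(x) \rVert_{\infty}^{p} \Bigr)^{1/p}

% Degree-p total persistence of a diagram, where x = (b_x, d_x) is a dot
% with birth b_x and death d_x:
\mathrm{Pers}_p(f) = \sum_{x \in Dgm(f)} \lvert d_x - b_x \rvert^{p}
```

Dots near the diagonal have small $\lvert d_x - b_x \rvert$, so they contribute little to either quantity, which is why short-lived structures are treated as noise.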