Enhancing Weakly-Supervised Histopathology Image Segmentation with Knowledge Distillation on MIL-Based Pseudo-Labels

Yinsheng He, Xingyu Li, Roger J. Zemp

TL;DR

This work tackles weakly supervised histopathology image segmentation by exploiting MIL-based pseudo-masks while preventing noise-driven degradation. It introduces an iterative fusion-knowledge distillation (IFKD) framework: stage-one MIL generates pseudo-masks; stage-two freezes a teacher and trains a student through fusion-based distillation, reinforced by a weighted cross-entropy regularizer and dynamic teacher–student switches to iteratively refine predictions. The method formalizes $f_w^t=\textstyle\sum_i w_i f_i^t$ and $L^t$, and optimizes $L_{kd}$ with $L_{wce}$ to balance high-level guidance with pixel-level precision, achieving state-of-the-art results on Camelyon16 and DigestPath2019 across multiple MIL backbones. The findings demonstrate that MIL outputs can serve as effective pseudo-labels when guided by robust distillation strategies, reducing annotation costs while delivering high segmentation accuracy for clinical histopathology applications.
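The fusion $f_w^t=\sum_i w_i f_i^t$ and the weighted cross-entropy term mentioned above can be illustrated with a minimal sketch. It rests on assumptions not stated in this summary: the $f_i^t$ are treated as same-sized per-pixel probability maps from the teacher, the weights $w_i$ are supplied by hand, and $L_{wce}$ is taken to be a class-weighted binary cross-entropy. The function names `fuse` and `weighted_bce` and the toy values are hypothetical; the paper's exact definitions may differ.

```python
import math

def fuse(maps, weights):
    """Weighted sum f_w = sum_i w_i * f_i over same-sized 2D maps."""
    if len(maps) != len(weights):
        raise ValueError("one weight per map is required")
    rows, cols = len(maps[0]), len(maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for w, m in zip(weights, maps):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += w * m[r][c]
    return fused

def weighted_bce(pred, target, pos_weight=1.0, neg_weight=1.0, eps=1e-7):
    """Class-weighted binary cross-entropy, averaged over all pixels.

    Assumption: this stands in for L_wce; the class weights are illustrative.
    """
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= pos_weight * t * math.log(p) + neg_weight * (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count

# Hypothetical 2x2 teacher maps (tumor probability per pixel).
f1 = [[0.9, 0.2], [0.1, 0.8]]
f2 = [[0.7, 0.4], [0.3, 0.6]]
fused = fuse([f1, f2], [0.5, 0.5])

# Hypothetical hard pseudo-mask obtained by thresholding the fused map at 0.5.
pseudo_mask = [[1 if p >= 0.5 else 0 for p in row] for row in fused]
loss = weighted_bce(fused, pseudo_mask, pos_weight=2.0)
```

Raising `pos_weight` penalizes missed tumor pixels more heavily than false alarms, which is one plausible way such a regularizer could counter class imbalance in pseudo-masks.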

Abstract

Segmenting tumors in histological images is vital for cancer diagnosis. While fully supervised models excel with pixel-level annotations, creating such annotations is labor-intensive and costly. Accurate histopathology image segmentation under weakly-supervised conditions, with only coarse-grained image labels, remains a challenging problem. Although multiple instance learning (MIL) has shown promise in segmentation tasks, surprisingly, no previous pseudo-supervision methods have used MIL-based outputs as pseudo-masks for training. We suspect this stems from concerns that noise in MIL results degrades the quality of pseudo-supervision. To explore the potential of leveraging MIL-based segmentation for pseudo-supervision, we propose a novel distillation framework for histopathology image segmentation. This framework introduces an iterative fusion-knowledge distillation strategy, enabling the student model to learn directly from the teacher's comprehensive outcomes. Through dynamic role reversal between the fixed teacher and learnable student models, and the incorporation of a weighted cross-entropy loss for model optimization, our approach prevents performance deterioration and noise amplification during knowledge distillation. Experimental results on the public histopathology datasets Camelyon16 and DigestPath2019 demonstrate that our approach not only complements various MIL-based segmentation methods but also significantly enhances their performance. Additionally, our method achieves a new state of the art in the field.
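The role reversal between a fixed teacher and a learnable student can be sketched as a small control loop. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not say when a switch happens, so the acceptance check (switch only if the student scores no worse) is an assumption, and `iterative_distillation`, `train_student`, `evaluate`, and the toy threshold model are all hypothetical.

```python
import copy

def iterative_distillation(teacher, train_student, evaluate, rounds=3):
    """Alternate teacher and student roles for a fixed number of rounds.

    Each round freezes a copy of the teacher, trains a student on its
    outputs, and promotes the student to teacher if it scores no worse.
    The promotion rule is an assumption made for this sketch.
    """
    history = [evaluate(teacher)]
    for _ in range(rounds):
        frozen = copy.deepcopy(teacher)      # teacher is not updated during the round
        student = train_student(frozen)      # student learns from the frozen teacher
        if evaluate(student) >= evaluate(frozen):
            teacher = student                # role switch: student becomes the teacher
        history.append(evaluate(teacher))
    return teacher, history

# Toy stand-in for a segmentation model: a single decision threshold.
initial_teacher = {"threshold": 0.9}

def train_student(frozen):
    # Hypothetical training step: move halfway toward a better threshold.
    return {"threshold": frozen["threshold"] + 0.5 * (0.5 - frozen["threshold"])}

def evaluate(model):
    # Hypothetical validation score: higher is better, best at threshold 0.5.
    return -abs(model["threshold"] - 0.5)

final_teacher, history = iterative_distillation(
    initial_teacher, train_student, evaluate, rounds=3
)
```

The guard keeps a noisy student from replacing a better teacher, which mirrors the stated goal of avoiding performance deterioration; in a real pipeline `train_student` would minimize the distillation and weighted cross-entropy losses.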

Paper Structure

This paper contains 18 sections, 4 equations, 8 figures, and 4 tables.

Figures (8)

  • Figure 1: Pseudo-mask comparison between the MIL and CAM methods: (a) original image, (b) ground-truth mask, (c) pseudo-mask generated by the CAM method, (d) pseudo-mask generated by the MIL method. The challenge in using MIL-based segmentation as pseudo-masks is preventing the segmentation model from deteriorating over time under pseudo-supervision.
  • Figure 2: Knowledge distillation structures. (a) and (b) are common knowledge distillation methods, and (c) is our fusion-knowledge distillation. These methods are compared in the fusion-knowledge distillation ablation figures for Camelyon16 (Fig. 5) and DigestPath2019.
  • Figure 3: An overview of our proposed weakly-supervised segmentation method, which consists of two stages. In the first stage, we train a simple teacher model through MIL with input image patches and patch labels. In the second stage, we use iterative knowledge distillation to achieve precise pixel-level segmentation. Note that during knowledge distillation, the teacher model stays frozen until the parameter switch.
  • Figure 4: Visualization of segmentation results on Camelyon16. Cancer tissues are shown in blue, and normal tissues are shown in green. OAA [jiang2019integral], OEEM [li2022online], and PistoSeg [fang2023weakly] follow the CAM-based pseudo-supervision paradigm, while SA-MIL [li2023weakly], Swin-MIL [qian2022transformer], Resnet-MIL [he2016deep], and our baseline VGG-MIL are MIL-based approaches. For reference, we also include the results of a fully-supervised approach, CAC-Unet [zhu2021multi].
  • Figure 5: Ablation study for fusion-knowledge distillation on Camelyon16. Each point represents the best result in the past 30 epochs, and error bars mark the standard deviation. The dashed line is the best score that the teacher model achieved during MIL. Basic knowledge distillation (a) and (b) correspond to the knowledge distillation structures shown in Fig. 2.
  • ...and 3 more figures