Resource Efficient Multi-stain Kidney Glomeruli Segmentation via Self-supervision

Zeeshan Nisar, Friedrich Feuerhake, Thomas Lampert

TL;DR

The study tackles domain shift across histopathology stains in kidney glomeruli segmentation when annotations are scarce. It evaluates three self-supervised pre-training methods (SimCLR, BYOL, and a histology-focused approach, HR-CS-CO) for learning representations that are then fine-tuned for segmentation with UNet (single-stain) and UDAGAN (multi-stain). Fine-tuning from these self-supervised features with only 5% of the labelled data comes close to the fully supervised baselines, cutting labelling needs by up to 95%. The findings generalise to public datasets (HuBMAP KPIs), and the pre-trained models and code are publicly released, which makes the approach directly usable for label-efficient histopathology segmentation.

Abstract

Semantic segmentation under domain shift remains a fundamental challenge in computer vision, particularly when labelled training data is scarce. This challenge is exemplified in histopathology image analysis, where the same tissue structures must be segmented across images captured under different imaging conditions (stains), each representing a distinct visual domain. Traditional deep learning methods like UNet require extensive labels, which are both costly and time-consuming to obtain, especially when dealing with multiple domains (or stains). To mitigate this, various unsupervised domain adaptation based methods such as UDAGAN have been proposed, which reduce the need for labels by requiring only one (source) stain to be labelled. Nonetheless, obtaining source stain labels can still be challenging. This article shows that through self-supervised pre-training (including SimCLR, BYOL, and a novel approach, HR-CS-CO) the performance of these segmentation methods (UNet and UDAGAN) can be largely retained even with 95% fewer labels. Notably, with self-supervised pre-training and using only 5% of the labels, the performance drops are minimal: 5.9% for UNet and 6.2% for UDAGAN, averaged over all stains, compared to their respective fully supervised counterparts (without pre-training, using 100% of the labels). Furthermore, these findings are shown to generalise beyond their training distribution to public benchmark datasets. Implementations and pre-trained models are publicly available online at https://github.com/zeeshannisar/resource-effecient-multi-stain-kidney-glomeruli-segmentation.git.

Paper Structure

This paper contains 29 sections, 5 equations, 12 figures, 5 tables.

Figures (12)

  • Figure 1: Different stains used in kidney pathology. Each image shows a glomerulus, and each stain provides specific information about the structure of the glomerulus.
  • Figure 2: Overview of the proposed HR-CS-CO architecture. In Step #1, stain separation is applied to separate $H_{ch}$ and $R_{ch}$ from each stain. In Step #2, cross-stain prediction is employed as a generative task, learning to predict $H_{ch}$ from $R_{ch}$ and $R_{ch}$ from $H_{ch}$. Lastly, in Step #3, contrastive learning is used as a discriminative task on the augmented views of $H_{ch}$ and $R_{ch}$ to learn the final representations. Here, the weights for $\phi$ and $\psi$ are initialised to those learnt during cross-stain prediction (i.e. Step #2), thereby combining the strengths of generative and discriminative learning.
  • Figure 3: Visualisation of the Haematoxylin ($H_{ch}$) and Residual ($R_{ch}$) channels extracted from each of the stains used in this study.
  • Figure 4: Stain-variation augmentation. From left to right: the process begins by decomposing an image into its corresponding Haematoxylin ($H_{ch}$) and Residual ($R_{ch}$) channels. Subsequently, each channel is modified individually using a random factor $\alpha$ and bias $\beta$. The modified versions are denoted $H''_{ch}$ and $R''_{ch}$.
  • Figure 5: Workflow of our study. Step #1: Different SSL methods are applied to learn representations from a large unlabelled dataset. Step #2: The learned representations are then refined by fine-tuning on a small labelled dataset for different histopathology-related segmentation tasks.
  • ...and 7 more figures
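
The stain-variation augmentation described in the Figure 4 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the Haematoxylin and Residual channels have already been separated into floating-point arrays, and the function name `stain_variation_augment`, the `sigma` parameter, and the uniform sampling ranges for $\alpha$ and $\beta$ are hypothetical choices that the caption does not specify.

```python
import numpy as np


def stain_variation_augment(h_ch, r_ch, sigma=0.05, rng=None):
    """Perturb each separated stain channel as x'' = alpha * x + beta.

    h_ch, r_ch: arrays holding the already-separated Haematoxylin and
    Residual channels (the separation step itself is not shown here).
    sigma: hypothetical strength parameter; alpha is drawn from
    [1 - sigma, 1 + sigma] and beta from [-sigma, sigma] (assumed ranges).
    """
    rng = np.random.default_rng() if rng is None else rng
    augmented = []
    for channel in (h_ch, r_ch):
        # Each channel gets its own independently drawn factor and bias.
        alpha = rng.uniform(1.0 - sigma, 1.0 + sigma)
        beta = rng.uniform(-sigma, sigma)
        augmented.append(alpha * np.asarray(channel, dtype=np.float64) + beta)
    return augmented[0], augmented[1]


# Toy usage with random stand-ins for the two channels.
rng = np.random.default_rng(0)
h_ch = rng.random((4, 4))
r_ch = rng.random((4, 4))
h_aug, r_aug = stain_variation_augment(h_ch, r_ch, sigma=0.05, rng=rng)
```

Drawing $\alpha$ and $\beta$ separately for each channel lets the two stain components vary independently, which is the property the caption emphasises; the actual sampling distribution and range used in the paper may differ.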