Improving Generalization Capability of Deep Learning-Based Nuclei Instance Segmentation by Non-deterministic Train Time and Deterministic Test Time Stain Normalization

Amirreza Mahbod; Georg Dorffner; Isabella Ellinger; Ramona Woitek; Sepideh Hatamikia

TL;DR

The paper addresses the poor generalization of DL-based nuclei instance segmentation to unseen histopathology datasets, which is caused by domain shift. It introduces a hybrid approach that combines non-deterministic train-time stain normalization (Macenko normalization with multiple reference images), deterministic test-time stain normalization, morphological test-time augmentation, and model ensembling on top of a strong baseline (DDU-Net). Across seven external datasets, the method yields consistent improvements in Dice, AJI, and PQ (up to 4.9%, 5.4%, and 5.9%, respectively), at the cost of longer inference time when test-time stain normalization is used. The work demonstrates a practical route to robust nuclei segmentation across diverse tissues, with potential applicability to other histopathology tasks and architectures; its noted limitations are the computational overhead and the reliance on stain normalization procedures.

Abstract

With the advent of digital pathology and microscopic systems that can scan and save whole slide histological images automatically, there is a growing trend to use computerized methods to analyze acquired images. Among different histopathological image analysis tasks, nuclei instance segmentation plays a fundamental role in a wide range of clinical and research applications. While many semi- and fully-automatic computerized methods have been proposed for nuclei instance segmentation, deep learning (DL)-based approaches have been shown to deliver the best performances. However, the performance of such approaches usually degrades when tested on unseen datasets. In this work, we propose a novel method to improve the generalization capability of a DL-based automatic segmentation approach. Besides utilizing one of the state-of-the-art DL-based models as a baseline, our method incorporates non-deterministic train time and deterministic test time stain normalization, and ensembling to boost the segmentation performance. We trained the model with one single training set and evaluated its segmentation performance on seven test datasets. Our results show that the proposed method provides up to 4.9%, 5.4%, and 5.9% better average performance in segmenting nuclei based on Dice score, aggregated Jaccard index, and panoptic quality score, respectively, compared to the baseline segmentation model.
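
For orientation, two of the three reported metrics have compact standard definitions. These are the commonly used forms and are not taken from the paper itself, so the authors' exact evaluation protocol may differ in details. Here $G$ and $P$ denote the ground-truth and predicted foreground pixel sets, and $TP$, $FP$, and $FN$ denote matched instance pairs, unmatched predicted instances, and unmatched ground-truth instances (a match is usually assumed to require an IoU above 0.5):

$$\mathrm{Dice} = \frac{2\,|G \cap P|}{|G| + |P|}, \qquad \mathrm{PQ} = \frac{\sum_{(g,p) \in TP} \mathrm{IoU}(g,p)}{|TP| + \tfrac{1}{2}|FP| + \tfrac{1}{2}|FN|}$$

The aggregated Jaccard index (AJI) extends the Jaccard index to the instance level and is omitted here because its definition is longer.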

Paper Structure (12 sections, 1 equation, 7 figures, 10 tables)

Introduction
Materials and Methods
Datasets
Segmentation model
Non-deterministic stain normalization in training
Test time stain normalization
Evaluation
Experimental setup
Results and Discussion
Conclusion
Acknowledgements
Competing interests

Figures (7)

Figure 1: The generic workflow of the DDU-Net (doi:10.3389/fmed.2022.978146) for nuclei instance segmentation.
Figure 2: Examples of a non-selected (first row) and a selected (second row) reference image.
Figure 3: Selected reference images from the MoNuSeg training data based on histogram analysis.
Figure 4: During non-deterministic stain normalization, the input training images were randomly passed either directly to the segmentation model or into the normalization pipeline, where they were normalized to one of seven reference images. Each path in the normalization pipeline was chosen with equal probability.
Figure 5: Proposed inference approach with deterministic test time stain normalization. The blue dashed boxes in each branch show the morphological test time augmentation (TTA). Trained model $n\in \{1,2,3,4,5\}$ represents the trained model for each fold of 5-fold cross-validation.
...and 2 more figures
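
The workflows described in the captions of Figures 4 and 5 can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several parts are assumptions made only for illustration: the function names, the `p_direct` parameter (the captions do not state how often an image bypasses normalization), the use of flips as the morphological TTA, one inference branch per reference image, and plain averaging as the ensembling rule. The `toy_normalize` function is a simple mean-shift stand-in and does not perform Macenko normalization; each model is treated as a callable that returns a single per-pixel map, which simplifies the actual DDU-Net outputs.

```python
import random
from typing import Callable, Optional, Sequence

import numpy as np

# Hypothetical signature: (image, reference) -> normalized image, both H x W x 3.
Normalizer = Callable[[np.ndarray, np.ndarray], np.ndarray]


def toy_normalize(image: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Stand-in for a stain normalizer (NOT Macenko): match per-channel means."""
    return image - image.mean(axis=(0, 1)) + reference.mean(axis=(0, 1))


def random_train_time_normalization(
    image: np.ndarray,
    references: Sequence[np.ndarray],
    normalize: Normalizer,
    p_direct: float = 0.5,  # assumed value; not given in the captions
    rng: Optional[random.Random] = None,
) -> np.ndarray:
    """Pass the image through unchanged, or normalize it to a random reference."""
    rng = rng or random.Random()
    if rng.random() < p_direct:
        return image  # direct path to the segmentation model
    # Each reference image is equally likely to be chosen.
    reference = rng.choice(list(references))
    return normalize(image, reference)


def flip_tta():
    """Return (forward, inverse) transform pairs; a flip is its own inverse."""
    identity = lambda x: x
    hflip = lambda x: x[:, ::-1]
    vflip = lambda x: x[::-1, :]
    return [(identity, identity), (hflip, hflip), (vflip, vflip)]


def ensemble_predict(
    image: np.ndarray,
    references: Sequence[np.ndarray],
    models: Sequence[Callable[[np.ndarray], np.ndarray]],
    normalize: Normalizer,
) -> np.ndarray:
    """Average predictions over all references, models, and TTA transforms."""
    predictions = []
    for reference in references:  # deterministic: every reference is used
        normalized = normalize(image, reference)
        for model in models:  # e.g. one model per cross-validation fold
            for forward, inverse in flip_tta():
                # Predict on the transformed image, then undo the transform
                # so that all maps are aligned before averaging.
                predictions.append(inverse(model(forward(normalized))))
    return np.mean(predictions, axis=0)
```

At training time, `random_train_time_normalization` would be called once per sampled image, so the same image can appear with different stain appearances across epochs. At test time, `ensemble_predict` uses every reference image and every model, so repeated runs on the same input give the same output, at the price of one forward pass per reference, model, and transform.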
