
Multi-domain stain normalization for digital pathology: A cycle-consistent adversarial network for whole slide images

Martin J. Hetz, Tabea-Clara Bucher, Titus J. Brinker

TL;DR

Stain variability across medical centers causes domain shift that hinders deep-learning-based pathology analysis. The authors introduce MultiStain-CycleGAN, a multi-domain stain normalization method based on CycleGAN that uses an intermediate domain to normalize unseen H&E stainings with a single model. They show that the method preserves tumor-diagnostic information while substantially disguising tissue origin, achieving the highest SSIM and a competitive FID among the tested approaches. The approach improves robustness to domain shifts and offers privacy benefits by reducing center-specific signatures, with potential extensions to additional stain types and downstream tasks. Overall, this work advances practical, scalable stain normalization for digital pathology and supports more generalizable, privacy-conscious AI-assisted diagnostics.

Abstract

The variation in histologic staining between different medical centers is one of the most profound challenges in the field of computer-aided diagnosis. The appearance disparity of pathological whole slide images causes algorithms to become less reliable, which in turn impedes the widespread applicability of downstream tasks like cancer diagnosis. Furthermore, different stainings introduce biases during training that, in the case of domain shifts, negatively affect test performance. Therefore, in this paper we propose MultiStain-CycleGAN, a multi-domain approach to stain normalization based on CycleGAN. Our modifications to CycleGAN allow us to normalize images of different origins without retraining or using different models. We perform an extensive evaluation of our method using various metrics and compare it to commonly used methods that are multi-domain capable. First, we evaluate how well our method fools a domain classifier that tries to assign a medical center to an image. Then, we test the effect of our normalization on the tumor classification performance of a downstream classifier. Furthermore, we evaluate the image quality of the normalized images using the structural similarity index (SSIM) and the ability to reduce the domain shift using the Fréchet inception distance (FID). We show that our method is multi-domain capable, provides the highest image quality among the compared methods, and can most reliably fool the domain classifier while keeping the tumor classifier performance high. Reducing the domain influence removes biases in the data on the one hand and disguises the origin of the whole slide image on the other, thus enhancing patient data privacy.

Paper Structure

This paper contains 27 sections, 9 equations, 11 figures, 3 tables.

Figures (11)

  • Figure 1: Left: Stain normalization based on conventional GAN approaches. For each staining a separate model has to be trained to normalize many stainings to the target domain. Right: Stain normalization with MultiStain-CycleGAN, which is trained on one staining and can normalize any H&E staining of the same tissue type. Dotted arrows indicate the data needed for training the respective model, normal arrows show the inference path.
  • Figure 2: Example slides for the different domains from the CAMELYON17 dataset. The images a)-e) show examples of tissue sections of the different centers with their different stainings: a) CWZ; b) RST; c) UMCU; d) RUMC; e) LPON.
  • Figure 3: The principle of image-to-image translation with CycleGAN proposed by Zhu et al. An image from a domain $\mathcal{X}$ is mapped to a domain $\mathcal{Y}$ by a generative model. After the mapping, the image is reconstructed in its original domain and the cycle-consistency loss is computed, which enables unpaired image-to-image translation.
  • Figure 4: Overview of MultiStain-CycleGAN. Images $x$ from a source domain are mapped to an intermediate domain by a function $H$, which consists of a color augmentation function and a grayscale conversion. The generator $G$ then transforms the gray image $w$ into the target domain, yielding the normalized image $y'$. To reconstruct the original image, the same process is repeated for $y'$, including its projection into the intermediate domain. The second path has been omitted for clarity. Further, instead of feeding the network a real image from the respective domain to compute the identity loss used by Zhu et al., the network performs a reconstruction task on unaugmented gray images $H'(x)$. The intermediate domain makes it possible to normalize any H&E staining without retraining the model.
  • Figure 5: Representative template images of the target domain, taken from three different slides and used for the template-based methods.
  • ...and 6 more figures
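
The intermediate-domain mapping described in the Figure 4 caption can be illustrated with a minimal sketch. The caption only states that $H$ combines a color augmentation with a grayscale conversion and that $H'$ omits the augmentation; everything else here is an assumption made for illustration, not the authors' implementation: the function names (`to_gray`, `color_augment`, `to_intermediate`), the per-channel gain and shift jitter with its ranges, and the ITU-R BT.601 luma weights.

```python
import numpy as np

# Assumed grayscale weights (ITU-R BT.601 luma); the caption only says
# "grayscale conversion" and does not specify the weighting.
LUMA = np.array([0.299, 0.587, 0.114])


def to_gray(rgb: np.ndarray) -> np.ndarray:
    """H'(x): unaugmented grayscale projection of an RGB tile.

    `rgb` has shape (height, width, 3) with values in [0, 1].
    """
    return rgb @ LUMA


def color_augment(rgb: np.ndarray, rng: np.random.Generator,
                  max_gain: float = 0.2, max_shift: float = 0.1) -> np.ndarray:
    """Hypothetical color augmentation: random per-channel gain and shift.

    The jitter type and ranges are illustrative assumptions.
    """
    gain = 1.0 + rng.uniform(-max_gain, max_gain, size=3)
    shift = rng.uniform(-max_shift, max_shift, size=3)
    return np.clip(rgb * gain + shift, 0.0, 1.0)


def to_intermediate(rgb: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """H(x): color augmentation followed by grayscale conversion."""
    return to_gray(color_augment(rgb, rng))


rng = np.random.default_rng(0)
tile = rng.random((64, 64, 3))      # stand-in for an H&E image tile
w = to_intermediate(tile, rng)      # gray image that a generator G would receive
w_plain = to_gray(tile)             # input for the reconstruction task on H'(x)
```

In this sketch, `w` plays the role of the gray image $w$ that the generator $G$ maps to the target domain, and `w_plain` corresponds to the unaugmented $H'(x)$ used for the reconstruction task. The generator itself and the training losses are not part of the sketch.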