Standardized CycleGAN training for unsupervised stain adaptation in invasive carcinoma classification for breast histopathology

Nicolas Nerrienet, Rémy Peyret, Marie Sockeel, Stéphane Sockeel

TL;DR

This study tackles the generalization problem in breast histopathology caused by stain and scanner variability by employing unsupervised stain translation with CycleGANs. It compares three downstream strategies, Multi-Domain Supervised 1 (MDS1), Multi-Domain Supervised 2 (MDS2), and Unsupervised Domain Augmentation (UDA), to a baseline classifier, and introduces a systematic CycleGAN training-stop method based on the Fréchet Inception Distance (FID), along with an analysis of data requirements. Results show that all CycleGAN-based approaches improve performance without target-domain labels, with UDA providing the most consistent and robust cross-center generalization. The work highlights practical data-efficiency considerations and offers a production-oriented workflow for stain-invariant breast cancer classification, while noting limitations and avenues for future refinement, such as newer generative models and the separation of staining and scanning factors.
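As a rough illustration of what an FID-based stopping rule could look like, the sketch below picks the checkpoint with the lowest FID and flags when training may stop. It assumes FID values between translated and real target-stain patches have already been computed per epoch by some external tool. The function names and the `patience` parameter are hypothetical and are not taken from the paper, whose exact criterion may differ.

```python
from typing import Dict, List


def best_epoch_by_fid(fid_by_epoch: Dict[int, float]) -> int:
    """Return the epoch whose checkpoint has the lowest FID (lower is better).

    Ties are broken in favor of the earliest epoch.
    """
    if not fid_by_epoch:
        raise ValueError("no FID measurements available")
    return min(fid_by_epoch, key=lambda epoch: (fid_by_epoch[epoch], epoch))


def should_stop(fid_history: List[float], patience: int = 5) -> bool:
    """Return True once FID has not improved for `patience` evaluations.

    `patience` is an illustrative assumption, not a value from the paper.
    """
    if len(fid_history) <= patience:
        return False
    best_before = min(fid_history[:-patience])
    # Stop if none of the last `patience` values beats the earlier best.
    return min(fid_history[-patience:]) >= best_before
```

A rule of this kind needs no visual inspection of translated patches: the retained generator is simply the one saved at `best_epoch_by_fid(...)`.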

Abstract

Generalization is one of the main challenges of computational pathology. Slide preparation heterogeneity and the diversity of scanners lead to poor model performance on data from medical centers not seen during training. To achieve stain invariance in breast invasive carcinoma patch classification, we implement a stain translation strategy using CycleGANs for unsupervised image-to-image translation. We compare three CycleGAN-based approaches to a baseline classification model obtained without any stain invariance strategy. Two of the proposed approaches use CycleGAN translations at inference or at training time to build stain-specific classification models. The last method uses them for stain data augmentation during training, which constrains the classification model to learn stain-invariant features. Baseline metrics are set by training and testing the baseline classification model on a reference stain. We assessed performance using data from three medical centers with H&E and H&E&S staining. Every approach tested in this study improves on the baseline metrics without needing labels on target stains. The stain augmentation-based approach produced the best results on every stain. The pros and cons of each method are studied and discussed in this paper. However, training high-performing CycleGAN models is in itself a challenge. In this work, we introduce a systematic method for optimizing CycleGAN training by setting a novel stopping criterion. This method does not require any visual inspection of CycleGAN results and proves superior to methods using a predefined number of training epochs. In addition, we study the minimal amount of data required for CycleGAN training.
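The stain augmentation idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: each trained source-to-target generator is represented by a plain callable, and the function name `stain_augment`, the probability `p`, and the uniform choice among target stains are all hypothetical choices rather than details confirmed by the paper.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

Patch = TypeVar("Patch")  # stand-in for an image tensor or array


def stain_augment(
    patch: Patch,
    translators: Sequence[Callable[[Patch], Patch]],
    p: float = 0.5,
    rng: Optional[random.Random] = None,
) -> Patch:
    """Randomly replace a labeled source patch by a translated version.

    With probability `p`, one source-to-target translator is drawn at
    random and applied; otherwise the original patch is returned. The
    label is unchanged, so no target-stain labels are required.
    """
    rng = rng or random.Random()
    if translators and rng.random() < p:
        return rng.choice(list(translators))(patch)
    return patch
```

Because the classifier sees the same labeled tissue under several stain appearances, it is pushed toward features that do not depend on the stain.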

Paper Structure (19 sections, 9 figures, 7 tables)

Figures (9)

  • Figure 1: Samples from various centers and their respective stains
  • Figure 2: CycleGAN architecture. Generator $A \rightarrow B$ takes a sample from domain $A$ and translates it into domain $B$, while discriminator $B$ classifies each sample as either a translated (fake) sample or a true sample from domain $B$. Uniqueness of the translation is encouraged by minimizing the difference between the original sample and the translated sample reconstructed in domain $A$ by generator $B \rightarrow A$. The same logic is applied to the other generator.
  • Figure 3: Examples of CycleGAN translations from the source stain to the target stains
  • Figure 4: The Multi-Domain Supervised 1 (MDS1) process. Training phase: the carcinoma classifier is trained on "real" source samples. Inference phase: the carcinoma classifier is used at inference time on "fake" target samples.
  • Figure 5: The Multi-Domain Supervised 2 (MDS2) process. Training phase: the carcinoma classifier is trained on "fake" target samples. Inference phase: the carcinoma classifier is used at inference time on "real" target samples.
  • ...and 4 more figures
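
The reconstruction constraint described in the Figure 2 caption corresponds to the cycle-consistency term of the standard CycleGAN formulation. As a sketch, writing $G_{AB}$ and $G_{BA}$ for the two generators (notation assumed here, not taken from the paper) and using the usual L1 norm, the term reads as below. The weighting of this term against the adversarial losses in this particular study is not stated in the text above.

```latex
\mathcal{L}_{\mathrm{cyc}}(G_{AB}, G_{BA})
  = \mathbb{E}_{a \sim A}\left[ \lVert G_{BA}(G_{AB}(a)) - a \rVert_1 \right]
  + \mathbb{E}_{b \sim B}\left[ \lVert G_{AB}(G_{BA}(b)) - b \rVert_1 \right]
```

The first expectation covers the $A \rightarrow B \rightarrow A$ round trip and the second covers $B \rightarrow A \rightarrow B$, matching the caption's remark that the same logic applies to the other generator.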