Scalable, Trustworthy Generative Model for Virtual Multi-Staining from H&E Whole Slide Images

Mehdi Ounissi, Ilias Sarbout, Jean-Pierre Hugot, Christine Martinez-Vinson, Dominique Berrebi, Daniel Racoceanu

TL;DR

This work addresses the need for scalable, trustworthy virtual staining of H&E whole-slide images by introducing a unified H&E encoder that feeds multiple stain decoders (up to $S=8$). It combines annotation-free, knowledge-guided training with specialized losses ($\mathcal{L}_{IHC}$, $\mathcal{L}_{H\&E}$, $\mathcal{L}_{fwd}$, $\mathcal{L}_{idt}$, $\mathcal{L}_{lat}$, $\mathcal{L}_{cyc}$) and regularization to improve paired and unpaired stain synthesis while mitigating artifacts. Trust is built via real-time self-inspection with per-stain discriminators that yield pixel-wise confidence heatmaps, alongside a QC framework that flags H&E inputs falling outside the trained distribution and reports confidence maps for the synthetic stains. A cloud-based deployment (Cytomine) enables browser-based virtual staining, and a new pediatric Crohn’s disease dataset (480 WSIs across eight stains) is released to spur reproducible research. Empirically, the unified-encoder approach achieves higher accuracy and efficiency than per-stain CycleGAN baselines, shows context-dependent improvements at larger contextual magnifications in unpaired settings, and mitigates stitching artifacts effectively, supporting clinical adoption of digital pathology tools.

Abstract

Chemical staining methods are dependable but time-consuming, require expensive chemicals, and raise environmental concerns. These challenges highlight the need for alternative solutions like virtual staining, which accelerates the diagnostic process and enhances stain application flexibility. Generative AI technologies are pivotal in addressing these issues. However, the high-stakes nature of healthcare decisions, especially in computational pathology, complicates the adoption of these tools due to their opaque processes. Our work introduces the use of generative AI for virtual staining, aiming to enhance performance, trustworthiness, scalability, and adaptability in computational pathology. The methodology centers on a single H&E encoder supporting multiple stain decoders. This design focuses on critical regions in the latent space of H&E, enabling precise synthetic stain generation. Our method, tested to generate 8 different stains from a single H&E slide, offers scalability by loading only the necessary model components during production. We integrate label-free knowledge in training, using loss functions and regularization to minimize artifacts, thus improving paired/unpaired virtual staining accuracy. To build trust, we use real-time self-inspection with discriminators for each stain type, providing pathologists with confidence heat-maps. Automatic quality checks verify that new H&E slides conform to the trained distribution, which supports accurate synthetic stains. Recognizing pathologists' challenges with new technologies, we have developed an open-source, cloud-based system that allows easy virtual staining of H&E slides through a browser, addressing hardware/software issues and facilitating real-time user feedback. We also curated a novel dataset of 8 paired H&E/stains related to pediatric Crohn's disease, comprising 480 WSIs, to further stimulate computational pathology research.
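The "single encoder, multiple decoders, load only what is needed" design can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `MultiStainModel`, the stain names `stain_0` to `stain_7`, and the toy encoder and decoders (simple NumPy operations standing in for trained networks) are all hypothetical and serve only to show how decoders might be instantiated lazily on first request.

```python
from typing import Callable, Dict, List

import numpy as np

Array = np.ndarray


class MultiStainModel:
    """One shared H&E encoder; per-stain decoders are built only on first use."""

    def __init__(
        self,
        encoder: Callable[[Array], Array],
        decoder_factories: Dict[str, Callable[[], Callable[[Array], Array]]],
    ) -> None:
        self.encoder = encoder
        self._factories = dict(decoder_factories)
        self._loaded: Dict[str, Callable[[Array], Array]] = {}

    def loaded_stains(self) -> List[str]:
        """Names of decoders currently held in memory."""
        return sorted(self._loaded)

    def stain(self, he_tile: Array, stain_name: str) -> Array:
        if stain_name not in self._factories:
            raise KeyError(f"unknown stain: {stain_name}")
        if stain_name not in self._loaded:
            # Lazy loading: a real system would read decoder weights here.
            self._loaded[stain_name] = self._factories[stain_name]()
        latent = self.encoder(he_tile)  # shared H&E representation
        return self._loaded[stain_name](latent)


def toy_encoder(he_tile: Array) -> Array:
    """Hypothetical stand-in for the encoder: scale uint8 RGB to [0, 1]."""
    return he_tile.astype(np.float32) / 255.0


def make_toy_decoder(seed: int) -> Callable[[], Callable[[Array], Array]]:
    """Hypothetical stand-in for a stain decoder: a fixed colour-mixing matrix."""

    def factory() -> Callable[[Array], Array]:
        rng = np.random.default_rng(seed)
        mix = rng.random((3, 3)).astype(np.float32)
        mix /= mix.sum(axis=0, keepdims=True)  # columns sum to 1, output stays in [0, 1]
        return lambda latent: latent @ mix

    return factory


# Eight decoders are registered (S = 8), but none is built until requested.
factories = {f"stain_{k}": make_toy_decoder(k) for k in range(8)}
model = MultiStainModel(toy_encoder, factories)
tile = np.full((4, 4, 3), 128, dtype=np.uint8)
synthetic = model.stain(tile, "stain_2")
```

After the call above, only the `stain_2` decoder is in memory; requesting another stain adds just that decoder, while the encoder is shared by all of them.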

Paper Structure (32 sections, 20 equations, 16 figures, 6 tables)

Figures (16)

  • Figure 1: Visual-XAI-enhanced trustworthy virtual staining approach. End-to-end virtual staining approach generating synthetic IHC stains by using a single H&E encoder and multiple stain decoders. A quality check (QC) protocol based on self-inspection features uses the trained discriminators to consolidate trust in the generated synthetic stains, by ensuring that new H&E slides align with the trained distribution and by validating the quality of the generated stained slides. Integration of cloud-based computing enhances accessibility and adoption by enabling pathologists to efficiently process large datasets from anywhere, while the end-to-end system's algorithms run in a containerized back-end environment.
  • Figure 2: Comprehensive representation of the training process for paired stain synthesis and computation of loss functions H&E $\leftrightarrow$ stain $i$. A. Details the first training cycle, starting with a paired real H&E image $X_{H\&E}$ and generating a corresponding stain $i$ image $\hat{Y}_i$, followed by the reconstruction of the original H&E image $\hat{X}_{H\&E}$ to facilitate computation of the loss function components detailed in Section \ref{sec:arch_DL}. B. Maps the second training cycle, beginning with a paired real stain $i$ image $X_i$, producing a corresponding H&E image $\hat{Y}_{H\&E}$, and concluding with the reconstructed stain $i$ image $\hat{X}_i$, using the staining mask $M_i$ ($\bar{M}_i$ corresponds to the complementary mask of $M_i$) to compute various elements of the loss function detailed in Section \ref{sec:arch_DL}. Each panel illustrates the model’s modifications aimed at enhancing the precision and consistency of stain synthesis and discrimination in paired training scenarios.
  • Figure 3: Detailed representation of the scalable training process for unpaired stain synthesis and computation of loss functions H&E $\leftrightarrow$ stain $i$. A. Illustrates the first training cycle, beginning with a real H&E image $X_{H\&E}$, generating a synthetic stain $i$ image $\hat{Y}_i$, and closing with the reconstructed H&E image $\hat{X}_{H\&E}$ to enable computation of the loss function components. B. Demonstrates the second training cycle, starting with a real stain $i$ image $X_i$, producing a synthetic H&E image $\hat{Y}_{H\&E}$, and concluding with the reconstructed stain $i$ image $\hat{X}_i$, incorporating the staining mask $M_i$ ($\bar{M}_i$ corresponds to the complementary mask of $M_i$) to compute various elements of the loss function (refer to Section \ref{sec:arch_DL}). Each panel highlights different aspects of the model’s adaptations and refinements, targeting and enhancing underrepresented activated regions to ensure more accurate and consistent stain synthesis and discrimination.
  • Figure 4: Multi-virtual staining results in the context of Crohn's disease. This figure showcases the high-resolution WSIs of various synthetic stains achieved using the $\mathcal{L}_{\text{IHC}}$ and $\mathcal{L}_{\text{H\&E}}$ loss functions in the unpaired setting.
  • Figure 5: Illustration of post-processing effects on stitching artifacts and performance metrics in virtual staining. (a) Depicts the improved outcomes using different overlap approaches with a Hamming window, emphasizing enhanced image quality and reduced artifacts, with the best trade-off between performance and execution time achieved at 60% overlap. (b) Shows typical stitching artifacts at tile borders with 0%, 30%, and 60% overlaps, marked by red arrows, demonstrating the sudden color changes and errors near the boundaries. This figure compares performance metrics (MSE, PSNR, SSIM) in both paired and unpaired settings, showcasing the effectiveness of the post-processing strategy in enhancing the overall quality and facilitating the adoption of virtual staining technologies in clinical environments. For reproducibility details, refer to Section \ref{method:tile_stitch}.
  • ...and 11 more figures
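
The overlap-plus-Hamming-window post-processing described in the Figure 5 caption can be sketched as a weighted average of overlapping tile predictions. The code below is a generic illustration under stated assumptions, not the procedure from the paper's tile-stitching section: the function names `tile_starts` and `stitch_with_hamming`, the `predict` callback (standing in for the per-tile virtual staining model), and the default tile size are hypothetical.

```python
from typing import Callable, List

import numpy as np


def tile_starts(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets covering [0, length) with tiles of size `tile`."""
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)  # extra tile so the far edge is covered
    return starts


def stitch_with_hamming(
    image: np.ndarray,
    predict: Callable[[np.ndarray], np.ndarray],
    tile: int = 64,
    overlap: float = 0.6,
) -> np.ndarray:
    """Blend per-tile predictions using a 2D Hamming window as the weight."""
    h, w, c = image.shape
    if h < tile or w < tile:
        raise ValueError("image must be at least one tile in each dimension")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    stride = max(1, int(round(tile * (1.0 - overlap))))

    # np.hamming never reaches zero (edge value 0.08), so every pixel that is
    # covered by at least one tile receives a strictly positive weight.
    window = np.outer(np.hamming(tile), np.hamming(tile))[..., None]

    acc = np.zeros((h, w, c), dtype=np.float64)
    weight = np.zeros((h, w, 1), dtype=np.float64)
    for y in tile_starts(h, tile, stride):
        for x in tile_starts(w, tile, stride):
            patch = predict(image[y:y + tile, x:x + tile])
            acc[y:y + tile, x:x + tile] += patch * window
            weight[y:y + tile, x:x + tile] += window
    return acc / weight  # weighted average down-weights tile borders


# Usage with an identity "model": blending must reproduce the input.
rng = np.random.default_rng(0)
slide = rng.random((100, 130, 3))
stitched = stitch_with_hamming(slide, lambda p: p, tile=32, overlap=0.6)
```

Because the window peaks at the tile centre, pixels near a tile border are dominated by neighbouring tiles in which they sit closer to the centre, which is what smooths abrupt colour changes at the seams. Larger overlaps mean more tiles per slide, so output quality is traded against execution time.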