Scalable, Trustworthy Generative Model for Virtual Multi-Staining from H&E Whole Slide Images
Mehdi Ounissi, Ilias Sarbout, Jean-Pierre Hugot, Christine Martinez-Vinson, Dominique Berrebi, Daniel Racoceanu
TL;DR
This work addresses the need for scalable, trustworthy virtual staining of H&E whole-slide images (WSIs) by introducing a single, unified H&E encoder that feeds multiple stain decoders (up to $S=8$). It combines annotation-free, knowledge-guided training with specialized losses ($\mathcal{L}_{IHC}$, $\mathcal{L}_{H\&E}$, $\mathcal{L}_{fwd}$, $\mathcal{L}_{idt}$, $\mathcal{L}_{lat}$, $\mathcal{L}_{cyc}$) and regularization to improve paired and unpaired stain synthesis while mitigating artifacts. Trust is built through real-time self-inspection, in which per-stain discriminators produce pixel-wise confidence heatmaps, and through a quality-control (QC) framework that flags anomalous input slides and outputs confidence maps for the synthetic stains. A cloud-based deployment (Cytomine) enables browser-based virtual staining, and a new pediatric Crohn’s disease dataset (480 WSIs across eight stains) is released to support reproducible research. Empirically, the unified-encoder approach achieves higher accuracy and efficiency than per-stain CycleGAN baselines, shows context-dependent improvements at larger contextual magnifications in unpaired settings, and effectively mitigates stitching artifacts, supporting clinical adoption of digital pathology tools.
Abstract
Chemical staining methods are dependable, but they are time-consuming, require expensive chemicals, and raise environmental concerns. These challenges highlight the need for alternatives such as virtual staining, which accelerates the diagnostic process and makes stain application more flexible. Generative AI technologies are pivotal in addressing these issues. However, the high-stakes nature of healthcare decisions, especially in computational pathology, complicates the adoption of these tools because their processes are opaque. Our work uses generative AI for virtual staining, aiming to enhance performance, trustworthiness, scalability, and adaptability in computational pathology. The methodology centers on a single H&E encoder supporting multiple stain decoders. This design focuses on critical regions in the latent space of H&E, enabling precise synthetic stain generation. Our method, tested to generate 8 different stains from a single H&E slide, offers scalability by loading only the necessary model components during production. We integrate label-free knowledge in training, using loss functions and regularization to minimize artifacts and thus improve paired and unpaired virtual staining accuracy. To build trust, we use real-time self-inspection with discriminators for each stain type, providing pathologists with confidence heatmaps. Automatic quality checks on new H&E slides verify that they conform to the training distribution, which helps ensure accurate synthetic stains. Recognizing pathologists' challenges with new technologies, we have developed an open-source, cloud-based system that allows easy virtual staining of H&E slides through a browser, addressing hardware and software issues and facilitating real-time user feedback. We also curated a novel dataset of 8 paired H&E/stains related to pediatric Crohn's disease, comprising 480 WSIs, to further stimulate computational pathology research.
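As a rough illustration of the design described above (one shared H&E encoder, stain decoders loaded only when requested, and a discriminator-derived confidence heatmap), the following toy sketch may help. It is not the authors' implementation: the names `VirtualStainer`, `make_decoder_factory`, `confidence_heatmap`, `stain_a`, and `stain_b` are hypothetical, the "encoder" and "decoders" are simple arithmetic stand-ins for neural networks, and the assumption that the discriminator emits a coarse grid of logits that is passed through a sigmoid and upsampled is ours, not stated in the abstract.

```python
import math
from typing import Callable, Dict, List

Grid = List[List[float]]          # toy 2D patch or score grid
Decoder = Callable[[Grid], Grid]  # maps a latent grid to a synthetic stain


def encode(patch: Grid) -> Grid:
    """Toy stand-in for the shared H&E encoder: centers intensities at 0."""
    return [[v - 0.5 for v in row] for row in patch]


def make_decoder_factory(gain: float) -> Callable[[], Decoder]:
    """Return a factory that builds a toy decoder (hypothetical helper)."""
    def factory() -> Decoder:
        def decode(latent: Grid) -> Grid:
            # Rescale the latent and clamp back to the [0, 1] intensity range.
            return [[min(1.0, max(0.0, 0.5 + gain * v)) for v in row]
                    for row in latent]
        return decode
    return factory


class VirtualStainer:
    """One shared encoder; each stain decoder is built on first use only."""

    def __init__(self, factories: Dict[str, Callable[[], Decoder]]):
        self._factories = factories
        self._loaded: Dict[str, Decoder] = {}

    def loaded_stains(self) -> List[str]:
        return sorted(self._loaded)

    def stain(self, patch: Grid, stain: str) -> Grid:
        if stain not in self._factories:
            raise KeyError(f"unknown stain: {stain}")
        if stain not in self._loaded:
            # Lazy loading: only the requested decoder is instantiated.
            self._loaded[stain] = self._factories[stain]()
        return self._loaded[stain](encode(patch))


def confidence_heatmap(logits: Grid, factor: int) -> Grid:
    """Sigmoid over coarse discriminator logits, then nearest-neighbor
    upsampling by `factor` (an assumed way to reach pixel resolution)."""
    probs = [[1.0 / (1.0 + math.exp(-z)) for z in row] for row in logits]
    heatmap: Grid = []
    for row in probs:
        wide = [p for p in row for _ in range(factor)]
        heatmap.extend(list(wide) for _ in range(factor))
    return heatmap


stainer = VirtualStainer({
    "stain_a": make_decoder_factory(1.0),  # hypothetical stain names
    "stain_b": make_decoder_factory(2.0),
})
synthetic = stainer.stain([[0.5, 0.75], [0.25, 1.0]], "stain_a")
heatmap = confidence_heatmap([[0.0]], factor=2)
```

After the call above, only the `stain_a` decoder has been built; `stain_b` stays unloaded until it is requested, which mirrors the idea of loading only the necessary components in production.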
