GANs vs. Diffusion Models for virtual staining with the HER2match dataset

Pascal Klöckner, José Teixeira, Diana Montezuma, Jaime S. Cardoso, Hugo M. Horlings, Sara P. Oliveira

TL;DR

Problem: public paired datasets for H&E–HER2 virtual staining are scarce. Approach: introduce the HER2match dataset, create BCI-clean, and benchmark six architectures (three GANs and three DMs, including the Brownian Bridge Diffusion Model) across datasets. Contributions: the first publicly available same-slide H&E–HER2 tiles; the first BBDM for H&E–HER2 translation; and a cross-dataset evaluation showing that GANs generally outperform DMs and that alignment quality significantly affects results. Significance: provides a high-quality data resource and guidance for developing histology-faithful virtual staining methods, and underscores the need for tissue-specific perceptual metrics.

Abstract

Virtual staining is a promising technique that uses deep generative models to recreate histological stains, providing a faster and more cost-effective alternative to traditional tissue chemical staining. Specifically for H&E-HER2 staining transfer, despite a rising trend in publications, the lack of sufficient public datasets has hindered progress in the topic. Additionally, it is currently unclear which model frameworks perform best for this particular task. In this paper, we introduce the HER2match dataset, the first publicly available dataset with the same breast cancer tissue sections stained with both H&E and HER2. Furthermore, we compare the performance of several Generative Adversarial Networks (GANs) and Diffusion Models (DMs), and implement a novel Brownian Bridge Diffusion Model for H&E-HER2 translation. Our findings indicate that, overall, GANs perform better than DMs, with only the BBDM achieving comparable results. Furthermore, we emphasize the importance of data alignment, as all models trained on HER2match produced vastly improved visuals compared to the widely used consecutive-slide BCI dataset. This research provides a new high-quality dataset ([available upon publication acceptance]), improving both model training and evaluation. In addition, our comparison of frameworks offers valuable guidance for researchers working on the topic.

Paper Structure

This paper contains 14 sections, 1 equation, 4 figures.

Figures (4)

  • Figure 1: H&E-to-IHC virtual staining frameworks: (A) Pyramid pix2pix, (B) ASP, (C) BCIstainer, (D) DDIB, (E) CM and (F) BBDM.
  • Figure 2: Examples from the BCI (A) and HER2match (B) datasets. Blue and red circles highlight two types of registration artifacts commonly present in BCI images, and yellow circles depict the level of alignment in the HER2match pairs.
  • Figure 3: (A) Distribution of SSIM, PSNR, and LPIPS on the test sets. (B) Heatmap of FID and KID. (C) Results of the linear models, with * denoting $p \leq 0.001$.
  • Figure 4: Image examples for models trained on BCI, BCI-clean, and HER2match.