VIMs: Virtual Immunohistochemistry Multiplex staining via Text-to-Stain Diffusion Trained on Uniplex Stains

Shikha Dubey; Yosep Chong; Beatrice Knudsen; Shireen Y. Elhabian

TL;DR

The paper addresses the need to derive multiple IHC stains from a single H&E section to maximize tissue utility in pathology. It introduces VIMs, a text-conditioned latent diffusion framework that uses a CLIP-based text encoder, a pre-trained Encoder-UNet-Decoder, LoRA adapters, and adversarial training to produce multiplex IHC images from uniplex training data. The method optimizes a three-term objective $\mathcal{L}_{total} = \mathcal{L}_{rec} + w_{clip} \mathcal{L}_{clip} + w_{adv} \mathcal{L}_{adv}$, where $\mathcal{L}_{rec}$ combines $L_2$ and $\mathcal{L}_{lpips}$, and demonstrates strong performance on CDX2 and CK8/18, approaching Pix2Pix and outperforming the base diffusion model. Pathologist assessments and ablative analyses confirm that informative hybrid prompts and negative samples enhance performance, suggesting VIMs’ potential for scalable clinical deployment and extension to additional IHC markers.
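The three-term objective above is a weighted sum, which can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `total_loss`, the scalar inputs, and the example weights are all hypothetical. The summary does not say how $L_2$ and $\mathcal{L}_{lpips}$ are weighted inside $\mathcal{L}_{rec}$, so an unweighted sum is assumed.

```python
def total_loss(l2, lpips, clip, adv, w_clip, w_adv):
    """Combine scalar losses as L_rec + w_clip * L_clip + w_adv * L_adv.

    Assumption: L_rec is the unweighted sum of the L2 and LPIPS terms.
    """
    rec = l2 + lpips  # reconstruction term (assumed unweighted sum)
    return rec + w_clip * clip + w_adv * adv


# Hypothetical numbers for illustration only (not taken from the paper).
example = total_loss(l2=0.5, lpips=0.25, clip=2.0, adv=1.0, w_clip=0.5, w_adv=0.25)
print(example)  # 2.0
```

Setting `w_clip` or `w_adv` to zero drops the corresponding term, which is one simple way to think about ablating the CLIP or adversarial component.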

Abstract

This paper introduces a Virtual Immunohistochemistry Multiplex staining (VIMs) model designed to generate multiple immunohistochemistry (IHC) stains from a single hematoxylin and eosin (H&E) stained tissue section. IHC stains are crucial in pathology practice for resolving complex diagnostic questions and guiding patient treatment decisions. While commercial laboratories offer a wide array of up to 400 different antibody-based IHC stains, small biopsies often lack sufficient tissue for multiple stains while preserving material for subsequent molecular testing. This highlights the need for virtual IHC staining. Notably, VIMs is the first model to address this need, leveraging a large vision-language single-step diffusion model for virtual IHC multiplexing through text prompts for each IHC marker. VIMs is trained on uniplex paired H&E and IHC images, employing an adversarial training module. Testing of VIMs includes both paired and unpaired image sets. To enhance computational efficiency, VIMs utilizes a pre-trained large latent diffusion model fine-tuned with small, trainable weights through the Low-Rank Adapter (LoRA) approach. Experiments on nuclear and cytoplasmic IHC markers demonstrate that VIMs outperforms the base diffusion model and achieves performance comparable to Pix2Pix, a standard generative model for paired image translation. Multiple evaluation methods, including assessments by two pathologists, are used to determine the performance of VIMs. Additionally, experiments with different prompts highlight the impact of text conditioning. This paper represents the first attempt to accelerate histopathology research by demonstrating the generation of multiple IHC stains from a single H&E input using a single model trained solely on uniplex data.
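To see why LoRA keeps the number of trainable weights small, consider one linear layer whose frozen weight matrix has shape `d_out × d_in`. LoRA leaves that matrix untouched and trains only two low-rank factors, `B` (`d_out × rank`) and `A` (`rank × d_in`), whose product is added to the frozen weight. The sketch below just counts parameters. The helper name, layer size, and rank are hypothetical choices for illustration and are not values reported for VIMs.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (frozen, trainable) parameter counts for one LoRA-adapted linear layer.

    The frozen weight W has shape (d_out, d_in); the trainable update is B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in).
    """
    frozen = d_in * d_out
    trainable = rank * (d_in + d_out)
    return frozen, trainable


# Hypothetical layer size and rank, chosen only for illustration.
frozen, trainable = lora_param_counts(d_in=1024, d_out=1024, rank=8)
print(frozen, trainable, trainable / frozen)  # 1048576 16384 0.015625
```

In this illustrative configuration the adapter trains about 1.6% as many parameters as the frozen layer holds, which is the general effect the abstract refers to as "small, trainable weights".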

Paper Structure (14 sections, 2 equations, 7 figures, 4 tables)

Introduction
Methods: Virtual Immunohistochemistry Multiplex staining (VIMs)
Latent Diffusion Model (LDM): Encoder-UNet-Decoder
Text/Prompt Encoder
LoRA Adapter and Skip Connections
Losses and Adversarial Training
Inference
Experimentation and Discussion
Dataset and Training Details
Evaluation Methods
Results
Ablation Study
Conclusion and Future Work
Supplementary

Figures (7)

Figure 1: VIMs: Proposed Multiplex IHC Staining Model. The pre-trained LDM, a large vision-language diffusion model, is optimized for the virtual multiplex IHC generation task with a minimal number of trainable parameters.
Figure 2: Visualization of Multiplex IHC stain generation on the test set with the CK8/18 GT marker. VIMs generates visually realistic images, accurately highlighting both markers, and performs well across various cases, including difficult samples like the 2nd H&E input example.
Figure 3: Qualitative results for Multiplex IHC stain generation on the test dataset with GT for the CDX2 marker. (a) Comparison of VIMs-generated IHC images with Pix2Pix, LDM, and ControlNet. ControlNet uses 25 inference steps, while the other models use a single step. The proposed VIMs model generates visually realistic and accurately highlighted images for both CDX2 and CK8/18 markers, performing well across various case types, including negative samples such as the 2nd H&E input example. (b) Impact of conditioned prompt types and sample types on VIMs performance.
Figure 4: Overview of various prompts validated by pathologists, used in model training and evaluation to assess their impact on VIMs.
Figure 5: Evaluation methodologies for paired and unpaired test sets, including pathologist assessments. For paired test datasets, methods 1, 2, and 3 are employed, while methods 1 and 2 are used for unpaired test datasets. M1 to M4 represent Models 1 to 4, VS: Virtual Stainer.
...and 2 more figures
