VLM-based Prompts as the Optimal Assistant for Unpaired Histopathology Virtual Staining

Zizhi Chen; Xinyu Zhang; Minghao Han; Yizhou Liu; Ziyun Qian; Weifeng Zhang; Xukun Zhang; Jingwei Wei; Lihua Zhang

TL;DR

This work addresses the challenge of virtually staining histopathology images while preserving cytological structure and accounting for staining physics. It introduces a pathology-aware, VLM-assisted framework (VPGAN) guided by three prompt modules (contrastive prompts, constant concept anchoring, and independent concept reinforcement), plus an inference-enhancement system (HARBOR) that uses DDIM and multi-level calibration to prevent staining-domain collapse. By leveraging a pathology-specific vision-language model, the authors achieve state-of-the-art realism and improved downstream glomerular detection and segmentation on unpaired staining datasets, with ablations confirming the value of each module. The approach supports data augmentation and has practical potential for reducing staining costs and improving diagnostic workflows in pathology.

Abstract

In histopathology, tissue sections are typically stained with routine H&E or with special stains (MAS, PAS, PASM, etc.) to make specific tissue structures clearly visible. Deep learning offers an effective way to generate virtually stained images, substantially reducing the time and labor costs of traditional histochemical staining. However, this raises a new challenge: separating the intrinsic visual characteristics of a tissue section from the visual differences introduced by the staining agents. In addition, virtual staining methods often overlook essential pathological knowledge and the physical properties of staining, so they achieve only style-level transfer. To address these issues, we introduce, for the first time in virtual staining, a pathology vision-language model (VLM) as an auxiliary tool. We integrate contrastive learnable prompts, foundational concept anchors for tissue sections, and staining-specific concept anchors to leverage the extensive knowledge of the pathology VLM; together these describe, frame, and reinforce the direction of virtual staining. Furthermore, we develop a data augmentation method based on VLM constraints. It uses the VLM's image interpretation capabilities to further integrate image style and structural information, which benefits high-precision pathological diagnosis. Extensive evaluations on publicly available multi-domain unpaired staining datasets show that our method generates highly realistic images and improves the accuracy of downstream tasks such as glomerular detection and segmentation. Our code is available at: https://github.com/CZZZZZZZZZZZZZZZZZ/VPGAN-HARBOR
