Cross-Stain Contrastive Learning for Paired Immunohistochemistry and Histopathology Slide Representation Learning

Yizhi Zhang, Lei Fan, Zhulin Tao, Donglin Di, Yang Song, Sidong Liu, Cong Cong

TL;DR

The paper tackles inter-stain misalignment in whole-slide image representations by introducing CSCL, a two-stage pretraining framework that leverages a paired five-stain dataset (H&E, HER2, KI67, ER, PGR). The first stage performs cross-stain patch-level alignment (CPA) through a lightweight adapter; the second stage builds MIL-based slide representations using cross-stain attention fusion (CAF) and cross-stain global alignment (CGA). Together they yield robust, transferable H&E embeddings. CSCL shows consistent improvements on cancer subtype classification, IHC biomarker status classification, and survival prediction, underscoring the value of spatially aligned multi-stain data for representation learning. The authors also release the aligned dataset and open-source code to support reproducibility and broader adoption in translational pathology.

Abstract

Universal, transferable whole-slide image (WSI) representations are central to computational pathology. Incorporating multiple markers (e.g., immunohistochemistry, IHC) alongside H&E enriches H&E-based features with diverse, biologically meaningful information. However, progress is limited by the scarcity of well-aligned multi-stain datasets. Inter-stain misalignment shifts corresponding tissue across slides, hindering consistent patch-level features and degrading slide-level embeddings. To address this, we curated a slide-level aligned, five-stain dataset (H&E, HER2, KI67, ER, PGR) to enable paired H&E-IHC learning and robust cross-stain representation. Leveraging this dataset, we propose Cross-Stain Contrastive Learning (CSCL), a two-stage pretraining framework. The first stage trains a lightweight adapter with patch-wise contrastive alignment to improve the compatibility of H&E features with corresponding IHC-derived contextual cues. The second stage performs slide-level representation learning with Multiple Instance Learning (MIL), using a cross-stain attention fusion module to integrate stain-specific patch features and a cross-stain global alignment module to enforce consistency among slide-level embeddings across stains. Experiments on cancer subtype classification, IHC biomarker status classification, and survival prediction show consistent gains, yielding high-quality, transferable H&E slide-level representations. The code and data are available at https://github.com/lily-zyz/CSCL.
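
To make the patch-wise contrastive alignment more concrete, the following is a minimal sketch of a symmetric InfoNCE-style loss over paired patch embeddings. It is an illustration under assumptions, not the authors' implementation: the abstract does not specify the loss form, so the choice of InfoNCE, the function names `l2_normalize` and `info_nce`, and the `temperature` value are all hypothetical. Row i of the H&E features is assumed to be spatially paired with row i of the IHC features.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce(he_feats, ihc_feats, temperature=0.1):
    """Symmetric InfoNCE loss (assumed loss form, not taken from the paper).

    he_feats[i] and ihc_feats[i] are treated as a positive pair; every other
    combination in the batch acts as a negative.
    """
    he = [l2_normalize(v) for v in he_feats]
    ihc = [l2_normalize(v) for v in ihc_feats]
    n = len(he)

    # Temperature-scaled cosine similarity between every H&E and IHC patch.
    logits = [
        [sum(a * b for a, b in zip(h, i)) / temperature for i in ihc]
        for h in he
    ]

    def cross_entropy_on_diagonal(matrix):
        # Cross-entropy where the correct "class" for row k is column k.
        total = 0.0
        for k, row in enumerate(matrix):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[k]
        return total / n

    transposed = [list(col) for col in zip(*logits)]
    # Average the H&E-to-IHC and IHC-to-H&E directions.
    return 0.5 * (
        cross_entropy_on_diagonal(logits)
        + cross_entropy_on_diagonal(transposed)
    )
```

With correctly paired patches the loss is small, while pairing each H&E patch with the wrong IHC patch increases it, which is the behaviour a registration error between stains would induce.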

Paper Structure

This paper contains 15 sections, 6 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Overview of CSCL. Preprocessing: Five types of stained WSIs are aligned, followed by tissue segmentation and 256$\times$256 patching. Patch Encoding: Patches are fed into a Vision Transformer (ViT); an adapter optimized by $\mathcal{L}_{\mathtt{CPA}}$ refines the resulting features to enforce consistent patch-level alignment across stains. Cross-Stain Attention Fusion: Multi-stain features are fused to enrich H&E representations. Slide Encoding: For each stain, patch embeddings are aggregated by MIL into stain-specific slide embeddings; subsequently, $\mathcal{L}_{\mathtt{CGA}}$ aligns H&E with IHC, yielding consistent, informative, and stain-invariant slide features.
  • Figure 2: Visual registration results for exemplary cases
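
The slide-encoding step in the Figure 1 caption (MIL aggregation followed by H&E-to-IHC alignment) can be sketched as below. This is a generic illustration under assumptions rather than the paper's CAF or CGA modules, whose exact formulations are not given in this excerpt: `attention_pool` uses a fixed dot-product `query` where a trained model would learn its attention scores, and `cosine_alignment_loss` is a hypothetical stand-in for $\mathcal{L}_{\mathtt{CGA}}$.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attention_pool(patches, query):
    """Aggregate patch embeddings into one slide embedding.

    Attention weights come from a dot product with `query` (hypothetical;
    attention-based MIL normally learns this scoring function).
    """
    scores = [sum(p_d * q_d for p_d, q_d in zip(p, query)) for p in patches]
    weights = softmax(scores)
    dim = len(patches[0])
    return [sum(w * p[d] for w, p in zip(weights, patches)) for d in range(dim)]


def cosine_alignment_loss(he_slide, ihc_slides):
    """Mean (1 - cosine similarity) between the H&E slide embedding and each
    IHC slide embedding. Assumes all embeddings are non-zero vectors."""

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    return sum(1.0 - cosine(he_slide, s) for s in ihc_slides) / len(ihc_slides)
```

In this sketch each stain would be pooled separately with `attention_pool`, and the loss would then pull the H&E slide embedding toward the IHC slide embeddings of the same case; it is zero when all embeddings point in the same direction.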