PathoDuet: Foundation Models for Pathological Slide Analysis of H&E and IHC Stains
Shengyi Hua, Fang Yan, Tianle Shen, Lei Ma, Xiaofan Zhang
TL;DR
PathoDuet addresses the data-hungry bottleneck of histopathology foundation models by introducing a self-supervised learning (SSL) framework built around a pretext token, with two targeted tasks. The cross-scale positioning task enhances H&E understanding by linking local patches to global context, while the cross-stain transferring task leverages H&E structure to interpret IHC slides via AdaIN-based feature transfer. Through two-stage pretraining and extensive downstream evaluation on both H&E and IHC tasks, PathoDuet demonstrates improved performance and data efficiency relative to conventional ImageNet and domain-specific baselines, and shows competitive results against much larger pathology models while using far less data. The work highlights the practical impact of tailored SSL design for pathology and provides a reusable open-source pipeline for advancing histopathology foundation models.
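To make the AdaIN-based feature transfer mentioned above concrete, the following is a minimal sketch of the generic adaptive instance normalization operation, which re-scales each channel of a content feature map to match the mean and standard deviation of a style feature map. It is not PathoDuet's implementation: the function name `adain`, the flat per-channel list layout, the `eps` value, and the toy inputs are all assumptions made for illustration, and which features play the content and style roles in the actual model is not specified here.

```python
from statistics import fmean, pstdev


def adain(content, style, eps=1e-5):
    """Generic AdaIN sketch (hypothetical helper, not the authors' code).

    content, style: lists of channels, each channel a flat list of floats.
    Each content channel is normalized to zero mean and unit variance,
    then re-scaled to the matching style channel's statistics.
    """
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mean, c_std = fmean(c_chan), pstdev(c_chan)
        s_mean, s_std = fmean(s_chan), pstdev(s_chan)
        # eps guards against division by zero for constant channels.
        out.append(
            [(v - c_mean) / (c_std + eps) * s_std + s_mean for v in c_chan]
        )
    return out


# Toy single-channel example with made-up values.
content_features = [[0.0, 1.0, 2.0, 3.0]]
style_features = [[5.0, 7.0, 9.0, 11.0]]
transferred = adain(content_features, style_features)
```

After the call, each channel of `transferred` keeps the spatial pattern of the content features but carries approximately the channel-wise mean and standard deviation of the style features.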
Abstract
The large and growing volume of digitized histopathological data holds promise for developing pathological foundation models via self-supervised learning methods. Foundation models pretrained with these methods serve as a good basis for downstream tasks. However, the gap between natural and histopathological images hinders the direct application of existing methods. In this work, we present PathoDuet, a series of models pretrained on histopathological images, together with a new self-supervised learning framework for histopathology. The framework features a newly introduced pretext token and subsequent task raisers that explicitly exploit certain relations between images, such as multiple magnifications and multiple stains. Based on this, two pretext tasks, cross-scale positioning and cross-stain transferring, are designed to pretrain the model on Hematoxylin and Eosin (H&E) images and to transfer the model to immunohistochemistry (IHC) images, respectively. To validate the efficacy of our models, we evaluate their performance on a wide variety of downstream tasks, including patch-level colorectal cancer subtyping and whole slide image (WSI)-level classification in the H&E domain, together with IHC marker expression-level prediction, tumor identification, and slide-level qualitative analysis in the IHC domain. The experimental results show the superiority of our models on most tasks and the efficacy of the proposed pretext tasks. The code and models are available at https://github.com/openmedlab/PathoDuet.
