Towards Large-Scale Training of Pathology Foundation Models
kaiko.ai, Nanne Aben, Edwin D. de Jong, Ioannis Gatopoulos, Nicolas Känzig, Mikhail Karasikov, Axel Lagré, Roman Moser, Joost van Doorn, Fei Tang
TL;DR
This work presents a scalable pipeline for training large-scale pathology foundation models (FMs). The pipeline uses Online Patching to sample patches from whole slide images (WSIs) dynamically during training, and it is paired with a standardized evaluation framework (eva) for fair cross-model comparisons. Through experiments on TCGA with DINO and DINOv2, the authors show that pretraining on ImageNet speeds convergence, that training with multiple magnifications improves robustness, and that data diversity is crucial for out-of-distribution generalization. They introduce unsupervised metrics (RankMe and ODCorr) that correlate with downstream performance, and they release both the models and eva for broader adoption. Overall, the approach enables scalable, reproducible development and evaluation of pathology FMs across diverse downstream tasks.
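To make the RankMe metric mentioned above concrete, here is a minimal sketch of how it is commonly defined: the exponential of the Shannon entropy of the normalized singular values of an embedding matrix, which gives a smooth estimate of the embeddings' effective rank. The function name `rankme`, the `eps` default, and the use of NumPy are assumptions made for illustration; this is not the authors' implementation, and ODCorr is not covered here.

```python
import numpy as np


def rankme(embeddings: np.ndarray, eps: float = 1e-7) -> float:
    """Effective rank of an (n_samples, dim) embedding matrix.

    Illustrative sketch: the name and eps default are assumptions,
    not taken from the paper's code.
    """
    # Singular values only; the singular vectors are not needed.
    sigma = np.linalg.svd(embeddings, compute_uv=False)
    # Normalize the spectrum to sum to one; eps guards against log(0).
    p = sigma / sigma.sum() + eps
    # Exponential of the Shannon entropy of the normalized spectrum.
    return float(np.exp(-np.sum(p * np.log(p))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-in for patch embeddings from a trained encoder.
    features = rng.normal(size=(512, 64))
    print(f"RankMe: {rankme(features):.2f} (upper bound is roughly dim = 64)")
```

A higher value means the embeddings spread over more dimensions. Because the score needs only unlabeled embeddings, it can be tracked during pretraining without any downstream labels.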
Abstract
Driven by recent advances in deep learning and, in particular, by the development of modern self-supervised learning algorithms, increasing interest and effort have been devoted to building foundation models (FMs) for medical images. In this work, we present our scalable training pipeline for large pathology imaging data, together with a comprehensive analysis of various hyperparameter choices and training techniques for building pathology FMs. We publicly release the first batch of our pathology FMs (https://github.com/kaiko-ai/towards_large_pathology_fms), trained on open-access TCGA whole slide images, a commonly used collection of pathology images. The experimental evaluation shows that our models reach state-of-the-art performance on various patch-level downstream tasks, ranging from breast cancer subtyping to colorectal nuclear segmentation. Finally, to unify the evaluation approaches used in the field and to simplify future comparisons of different FMs, we present an open-source framework (https://github.com/kaiko-ai/eva) designed for the consistent evaluation of pathology FMs across various downstream tasks.
