Distilling foundation models for robust and efficient models in digital pathology
Alexandre Filiot, Nicolas Dop, Oussama Tchita, Auriane Riou, Rémy Dubois, Thomas Peeters, Daria Valter, Marin Scalbert, Charlie Saillard, Geneviève Robin, Antoine Olivier
TL;DR
This paper tackles the high computational cost and limited robustness of large foundation models (FMs) in digital pathology by distilling a billion-parameter FM into a lightweight ViT-based encoder, H0-mini. The student is trained in a teacher–student setup on 43M tiles from TCGA, using a dual loss that combines the DINO and iBOT objectives, so that it preserves the teacher's performance while dramatically reducing inference cost. Across the EVA, HEST, PLISM, and BreastBm benchmarks, H0-mini delivers competitive accuracy and greater robustness to staining and scanner variations than larger FMs. The work demonstrates that distillation can yield efficient, robust pathology models suitable for clinical deployment, and it provides public data and resources to encourage further research and adoption.
Abstract
In recent years, the advent of foundation models (FMs) for digital pathology has relied heavily on scaling pre-training datasets and model size, yielding large and powerful models. While this has improved performance on diverse downstream tasks, it has also increased computational cost and inference time. In this work, we explore the distillation of a large foundation model into a smaller one, reducing the number of parameters by several orders of magnitude. Leveraging distillation techniques, our distilled model, H0-mini, achieves nearly comparable performance to large FMs at a significantly reduced inference cost. It is evaluated on several public benchmarks, achieving 3rd place on the HEST benchmark and 5th place on the EVA benchmark. Additionally, a robustness analysis conducted on the PLISM dataset demonstrates that our distilled model achieves excellent robustness to variations in staining and scanning conditions, significantly outperforming other state-of-the-art models. This opens new perspectives for designing lightweight and robust models for digital pathology, without compromising on performance.
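
To make the teacher–student objective concrete, the following is a minimal sketch of a dual distillation loss of the kind summarized above: a DINO-style cross-entropy on the class token plus an iBOT-style cross-entropy averaged over patch tokens. It is an illustration under stated assumptions, not the training code of this work. The function names, the temperatures (0.04 for the teacher, 0.1 for the student), the `patch_weight` parameter, and the choice to use all patch tokens without masking or teacher centering are hypothetical.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of a list of logits at a given temperature."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(v - m) for v in scaled))
    return [v - log_z for v in scaled]


def cross_entropy(teacher_logits, student_logits, teacher_temp, student_temp):
    """Cross-entropy H(p_teacher, p_student) between temperature-scaled softmaxes."""
    teacher_probs = [math.exp(v) for v in log_softmax(teacher_logits, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))


def dual_distillation_loss(teacher_cls, student_cls, teacher_patches, student_patches,
                           teacher_temp=0.04, student_temp=0.1, patch_weight=1.0):
    """Class-token term (DINO-style) plus mean patch-token term (iBOT-style).

    Assumptions: temperatures and patch_weight are illustrative defaults;
    every patch token contributes (no masking), and no teacher centering is applied.
    """
    cls_loss = cross_entropy(teacher_cls, student_cls, teacher_temp, student_temp)
    patch_losses = [
        cross_entropy(t, s, teacher_temp, student_temp)
        for t, s in zip(teacher_patches, student_patches)
    ]
    patch_loss = sum(patch_losses) / len(patch_losses)
    return cls_loss + patch_weight * patch_loss


# Toy prototype scores for one image: one class token and two patch tokens.
teacher_cls = [2.0, 0.5, -1.0]
student_cls = [0.1, 0.3, -0.2]
teacher_patches = [[1.0, 0.0, -1.0], [0.2, 0.4, 0.1]]
student_patches = [[0.5, 0.5, 0.0], [0.0, 1.0, -1.0]]
loss = dual_distillation_loss(teacher_cls, student_cls, teacher_patches, student_patches)
```

In an actual training loop the teacher outputs would be treated as fixed targets and only the student would receive gradients. The sketch computes only the scalar objective.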
