Atlas: A Novel Pathology Foundation Model by Mayo Clinic, Charité, and Aignostics

Maximilian Alber, Stephan Tietz, Jonas Dippel, Timo Milbich, Timothée Lesort, Panos Korfiatis, Moritz Krügener, Beatriz Perez Cancer, Neelay Shah, Alexander Möllers, Philipp Seegerer, Alexandra Carpen-Amarie, Kai Standvoss, Gabriel Dernbach, Edwin de Jong, Simon Schallenberg, Andreas Kunft, Helmut Hoffer von Ankershoffen, Gavin Schaeferle, Patrick Duffy, Matt Redlon, Philipp Jurmeister, David Horst, Lukas Ruff, Klaus-Robert Müller, Frederick Klauschen, Andrew Norgan

TL;DR

Atlas addresses generalization gaps in digital pathology AI by training a pathology foundation model on 1.2M whole slide images (WSIs) from two institutions using the RudolfV self-supervised approach. It integrates multi-stain and multi-magnification data within a ViT-H/14 backbone, producing robust slide- and patch-level representations. Evaluated across 21 public benchmarks with linear probing and ABMIL heads, Atlas achieves an average performance of 61.9% and leads on 11 of 21 tasks, even though it has neither the largest parameter count nor the largest training dataset. The work underscores the value of diverse, multi-scale pathology data and standardized evaluation for advancing clinically robust foundation models.
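
The ABMIL heads mentioned above aggregate many patch embeddings into one slide-level vector through learned attention weights. The following is a minimal sketch of that pooling step in its generic attention-based MIL form, not the evaluation code used for Atlas. The function name `abmil_pool`, the parameters `V` and `w`, and the toy inputs are all assumptions made for illustration; in practice `V` and `w` are learned while the encoder stays frozen.

```python
import math

def abmil_pool(patches, V, w):
    """Attention-based MIL pooling over frozen patch embeddings (illustrative).

    patches: list of n patch embeddings, each a list of `dim` floats
    V:       list of `hidden` rows, each of length `dim` (hypothetical projection)
    w:       list of `hidden` floats (hypothetical attention vector)
    Returns (slide_embedding, attention_weights).
    """
    # Unnormalized attention score per patch: w^T tanh(V h)
    scores = []
    for h in patches:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ti for wi, ti in zip(w, hidden)))

    # Softmax over patches, shifted by the max score for numerical stability
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Slide embedding is the attention-weighted sum of patch embeddings
    dim = len(patches[0])
    slide = [sum(a * h[d] for a, h in zip(attn, patches)) for d in range(dim)]
    return slide, attn

# Toy inputs: three 2-dimensional "patch embeddings" (made-up values)
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, -1.0]]
w = [2.0]
slide_vec, attn = abmil_pool(patches, V, w)
```

A linear classifier trained on `slide_vec` would then give the slide-level prediction. With `w` set to all zeros the attention becomes uniform and the pooling reduces to a plain mean over patches.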

Abstract

Recent advances in digital pathology have demonstrated the effectiveness of foundation models across diverse applications. In this report, we present Atlas, a novel vision foundation model based on the RudolfV approach. Our model was trained on a dataset comprising 1.2 million histopathology whole slide images, collected from two medical institutions: Mayo Clinic and Charité - Universitätsmedizin Berlin. Comprehensive evaluations show that Atlas achieves state-of-the-art performance across twenty-one public benchmark datasets, even though it is neither the largest model by parameter count nor by training dataset size.

Paper Structure (25 sections, 2 figures, 5 tables)

Figures (2)

  • Figure 1: Overview of average performance, training dataset size, and model size of the different contenders. The average performance is detailed in Table \ref{tab:MAX}. H-Optimus-0 [hoptimus0] and Prov-GigaPath [xu_whole-slide_2024] are the models with the most parameters, and Virchow2 [zimmermann_virchow_2024] is the model trained on the most slides. Our model, Atlas, exhibits the best average performance with an intermediate model size and training dataset size.
  • Figure 2: (A) shows the key training dataset statistics. The dataset was derived from 1.2 million pathology slides from 490k cases. The dataset contains data from over 70 tissue/organ types, over 100 different staining types, and 7 scanner types. The data was sourced from Mayo Clinic and Charité - Universitätsmedizin Berlin. (B) shows the distribution of neoplastic vs. non-neoplastic diseases. (C) shows the distribution of the staining groups H&E, IHC, and special stains.