
HistoEncoder: a digital pathology foundation model for prostate cancer

Joona Pohjonen, Abderrahim-Oussama Batouche, Antti Rannikko, Kevin Sandeman, Andrew Erickson, Esa Pitkanen, Tuomas Mirtti

TL;DR

HistoEncoder is a foundation model for prostate cancer digital pathology, pre-trained on 48 million prostate tissue tile images. Features that HistoEncoder extracts from tile images with similar histological patterns map closely together in the feature space.

Abstract

Foundation models are trained on massive amounts of data to distinguish complex patterns and can be adapted to a wide range of downstream tasks with minimal computational resources. Here, we develop a foundation model for prostate cancer digital pathology called HistoEncoder by pre-training on 48 million prostate tissue tile images. We demonstrate that HistoEncoder features extracted from tile images with similar histological patterns map closely together in the feature space. HistoEncoder outperforms models pre-trained with natural images, even without fine-tuning or with 1000 times less training data. We describe two use cases that leverage the capabilities of HistoEncoder by fine-tuning the model with a limited amount of data and computational resources. First, we show how HistoEncoder can be used to automatically annotate large-scale datasets with high accuracy. Second, we combine histomics with commonly used clinical nomograms, significantly improving survival models for prostate cancer-specific death. Foundation models such as HistoEncoder can allow organizations with limited resources to build effective clinical software tools without needing extensive datasets or significant computing resources.
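The claim that the features are useful without fine-tuning can be illustrated with a k-nearest-neighbour classifier over frozen feature vectors, the approach named in the Figure 1 caption. The following is a minimal sketch, not the authors' implementation: the function name `knn_predict`, the use of cosine similarity, and the value of `k` are assumptions, and the feature arrays are stand-ins for vectors that would come from the encoder.

```python
import numpy as np


def knn_predict(train_feats, train_labels, query_feats, k=5):
    """Classify query features by majority vote among the k most similar
    training features (cosine similarity). Hypothetical helper."""
    train_feats = np.asarray(train_feats, dtype=float)
    query_feats = np.asarray(query_feats, dtype=float)
    train_labels = np.asarray(train_labels, dtype=int)

    # L2-normalise rows so that a dot product equals cosine similarity.
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    query = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    sims = query @ train.T  # shape: (n_query, n_train)

    # Indices of the k most similar training tiles for each query tile.
    nearest = np.argsort(-sims, axis=1)[:, :k]
    votes = train_labels[nearest]  # shape: (n_query, k)

    # Majority vote per query tile.
    n_classes = int(train_labels.max()) + 1
    return np.array(
        [np.bincount(row, minlength=n_classes).argmax() for row in votes]
    )


# Toy stand-in features: two tiles per class (0 = benign, 1 = cancer).
train_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_labels = [0, 0, 1, 1]
predictions = knn_predict(train_feats, train_labels, [[1.0, 0.05], [0.05, 1.0]], k=3)
```

No parameters are trained here; the only cost is the similarity search, which is why such a classifier is a useful probe of how well the pre-trained features separate tissue types on their own.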

Paper Structure

This paper contains 17 sections, 9 figures, 7 tables.

Figures (9)

  • Figure 1: An overview of the HistoEncoder workflow. HistoEncoder utilizes the cross-covariance transformer (XCiT) as the backbone network. The models are pre-trained with 48 million prostate tissue tile images from 1,307 patients in a self-supervised manner (Step 1), and fine-tuned to cancer classification and mortality prediction tasks, or used without fine-tuning via a KNN classifier (Step 2).
  • Figure 2: AUROC scores for the prostate cancer classifiers fine-tuned from pre-trained prostate-s and natural-s encoder models using the PESO dataset. Prostate-s based models achieve higher AUROC scores than natural-s based models with fewer fine-tuned parameters (a) and less training data (b). A simple KNN classifier requiring no training significantly outperforms a fully fine-tuned natural-s encoder model on all evaluation datasets.
  • Figure 3: UMAP representation of the features extracted from the epithelium tissue in the Radboud dataset. Benign and cancerous epithelium, as well as different Gleason grades, form clear clusters in the features extracted with the prostate-s encoder, while features extracted with the natural-s model are more mixed.
  • Figure 4: Cluster label purities for different labels in the Karolinska and Radboud datasets. A significant proportion of tile images in these datasets are contained in clusters with greater than 90% label purity for cancerous vs. benign tissue (a), stroma vs. epithelium tissue (b, left), and cancerous vs. benign epithelium tissue (b, right).
  • Figure 5: Head-to-head concordance comparisons between 1,000 stratified random splits (left), time-dependent AUC scores (centre) and net benefit curves for prediction of prostate cancer-specific death at 15 years (right). Including patient-level feature cluster percentages improves concordance and time-dependent AUC score, and provides a higher net benefit over a wide range of threshold probabilities for the Gleason grade (a), CAPRA-S (b) and MSKCC-S (c) nomograms.
  • ...and 4 more figures
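
The cluster label purity reported in the Figure 4 caption can be sketched as follows. This is an illustrative reading rather than the paper's code: it assumes each tile already has a cluster assignment and an integer label, takes a cluster's purity to be the share of its most common label, and uses the hypothetical helper names `cluster_purities` and `fraction_in_pure_clusters`. How the clusters themselves are obtained is not specified here.

```python
import numpy as np


def cluster_purities(cluster_ids, labels):
    """Return {cluster_id: (purity, size)}, where purity is the share of
    the most common label among the tiles in that cluster."""
    cluster_ids = np.asarray(cluster_ids)
    labels = np.asarray(labels, dtype=int)
    stats = {}
    for c in np.unique(cluster_ids):
        members = labels[cluster_ids == c]
        counts = np.bincount(members)
        stats[int(c)] = (counts.max() / members.size, int(members.size))
    return stats


def fraction_in_pure_clusters(cluster_ids, labels, threshold=0.9):
    """Proportion of all tiles that sit in clusters whose purity
    exceeds the threshold."""
    stats = cluster_purities(cluster_ids, labels)
    pure_tiles = sum(size for purity, size in stats.values() if purity > threshold)
    return pure_tiles / len(labels)


# Toy example: cluster 0 holds only label 1; cluster 1 is mixed (4 of 6 are label 0).
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
labels = [1, 1, 1, 1, 0, 0, 0, 1, 1, 0]
purities = cluster_purities(cluster_ids, labels)
pure_fraction = fraction_in_pure_clusters(cluster_ids, labels)
```

In the toy data, only cluster 0 passes the 90% threshold, so 4 of the 10 tiles count as lying in a high-purity cluster.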