EXAONEPath 1.0 Patch-level Foundation Model for Pathology

Juseung Yun, Yi Hu, Jinhyung Kim, Jongseong Jang, Soonyoung Lee

TL;DR

The paper identifies WSI-specific feature collapse in self-supervised, patch-based learning for digital pathology, in which features cluster by the whole slide image (WSI) they were extracted from. It mitigates the problem with EXAONEPath, a patch-level foundation model trained with DINO on Macenko-normalized patches. The resulting features are more general and more robust to color variation, and the model performs competitively on six patch-level downstream tasks while using fewer WSIs and fewer parameters than the compared models. Incorporating stain normalization into pretraining improves training efficiency and generalization, although some collapse remains and is left for further research.

Abstract

Recent advancements in digital pathology have led to the development of numerous foundational models that utilize self-supervised learning on patches extracted from gigapixel whole slide images (WSIs). While this approach leverages vast amounts of unlabeled data, we have discovered a significant issue: features extracted from these self-supervised models tend to cluster by individual WSIs, a phenomenon we term WSI-specific feature collapse. This problem can potentially limit the model's generalization ability and performance on various downstream tasks. To address this issue, we introduce EXAONEPath, a novel foundational model trained on patches that have undergone stain normalization. Stain normalization helps reduce color variability arising from different laboratories and scanners, enabling the model to learn more consistent features. EXAONEPath is trained using 285,153,903 patches extracted from a total of 34,795 WSIs. Our experiments demonstrate that EXAONEPath significantly mitigates the feature collapse problem, indicating that the model has learned more generalized features rather than overfitting to individual WSI characteristics. We compared EXAONEPath with state-of-the-art models across six downstream task datasets, and our results show that EXAONEPath achieves superior performance relative to the number of WSIs used and the model's parameter count. This suggests that the application of stain normalization has substantially improved the model's efficiency and generalization capabilities.
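To make the stain-normalization step concrete, the following is a minimal NumPy sketch of Macenko normalization applied to a single RGB patch. It is not the authors' preprocessing code. The function name `macenko_normalize`, the reference stain matrix `HE_REF`, the reference concentrations `MAX_C_REF`, and the defaults `io=240`, `alpha=1.0`, `beta=0.15` are assumptions taken from commonly circulated open-source implementations, not values reported in the paper.

```python
import numpy as np

# Reference H&E stain matrix and maximum concentrations often used in
# open-source Macenko implementations. They are assumptions here, not
# values reported by the paper.
HE_REF = np.array([[0.5626, 0.2159],
                   [0.7201, 0.8012],
                   [0.4062, 0.5581]])
MAX_C_REF = np.array([1.9705, 1.0308])


def macenko_normalize(img, io=240, alpha=1.0, beta=0.15):
    """Normalize an RGB uint8 patch of shape (H, W, 3) with Macenko's method."""
    h, w, _ = img.shape
    pixels = img.reshape(-1, 3).astype(np.float64)

    # Convert intensities to optical density (Beer-Lambert law).
    od = -np.log((pixels + 1.0) / io)

    # Discard near-transparent (background) pixels before estimating stains.
    od_tissue = od[np.all(od > beta, axis=1)]
    if od_tissue.shape[0] < 2:
        return img.copy()  # not enough tissue to estimate stain vectors

    # Plane spanned by the two principal directions of the tissue OD values.
    _, eigvecs = np.linalg.eigh(np.cov(od_tissue.T))
    plane = eigvecs[:, 1:3]  # eigh returns eigenvalues in ascending order

    # Angle of every tissue pixel within that plane, then robust extremes.
    proj = od_tissue @ plane
    phi = np.arctan2(proj[:, 1], proj[:, 0])
    min_phi, max_phi = np.percentile(phi, [alpha, 100.0 - alpha])
    v_min = plane @ np.array([np.cos(min_phi), np.sin(min_phi)])
    v_max = plane @ np.array([np.cos(max_phi), np.sin(max_phi)])

    # Put hematoxylin first (the vector with the larger red-channel OD).
    pair = (v_min, v_max) if v_min[0] > v_max[0] else (v_max, v_min)
    he = np.column_stack(pair)

    # Per-pixel stain concentrations by least squares: od.T ~= he @ conc.
    conc, *_ = np.linalg.lstsq(he, od.T, rcond=None)

    # Rescale so the 99th-percentile concentrations match the reference.
    max_c = np.percentile(conc, 99, axis=1)
    conc *= (MAX_C_REF / np.maximum(max_c, 1e-8))[:, None]

    # Recompose the patch with the reference stain matrix.
    out = io * np.exp(-HE_REF @ conc)
    return np.clip(out, 0, 255).T.reshape(h, w, 3).astype(np.uint8)
```

Under these assumptions, two patches of the same tissue that differ only in overall stain intensity should map to nearly the same output, which is the color-variability reduction the abstract describes. A patch with no detectable tissue is returned unchanged.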

Paper Structure (13 sections, 3 figures, 1 table)

Figures (3)

  • Figure 1: Performance comparison of models based on the number of parameters and the number of WSIs used for training. The average Top-1 accuracy represents the mean linear evaluation performance across six downstream tasks. (a) Average Top-1 accuracy versus the number of parameters. (b) Average Top-1 accuracy versus the number of WSIs used for training. Notably, our model (EXAONEPath) achieves high performance despite having fewer parameters and using fewer WSIs compared to other models, demonstrating its efficiency.
  • Figure 2: t-SNE visualization of features extracted from a foundation model trained with DINO. 1000 patches are randomly sampled from each of 10 arbitrarily selected WSIs, and the features from each of the 10 WSIs are represented in different colors. Despite having no information about the source WSI for each patch, the model exhibits WSI-specific feature collapse, where features tend to cluster by WSI. (a) Features obtained by inputting patches without any stain normalization into a model trained on non-stain-normalized images, showing severe feature collapse. (b) Features obtained by inputting stain-normalized patches into a model trained on non-stain-normalized images, showing significant but reduced collapse compared to (a). (c) Features obtained by inputting stain-normalized patches into a model trained on stain-normalized images, showing significantly reduced feature collapse, as proposed by our method.
  • Figure 3: MPP distribution of the training data. Most of the data used for training is concentrated around 0.25 MPP and 0.5 MPP.
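
The sampling and projection procedure described in the Figure 2 caption can be sketched as follows, assuming patch features have already been extracted by some encoder. This is an illustrative sketch, not the authors' code: the function name `sample_and_embed`, the `features_by_wsi` dictionary layout, and the t-SNE settings (scikit-learn defaults with PCA initialization) are assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE


def sample_and_embed(features_by_wsi, patches_per_wsi=1000, num_wsis=10, seed=0):
    """Sample patch features per WSI and project them to 2-D with t-SNE.

    features_by_wsi: dict mapping a WSI identifier to an array of shape
    (num_patches, feature_dim) that holds precomputed patch features.
    Returns the 2-D embedding and an integer WSI label for each point.
    """
    rng = np.random.default_rng(seed)
    wsi_ids = sorted(features_by_wsi)
    chosen = rng.choice(len(wsi_ids), size=num_wsis, replace=False)

    sampled, labels = [], []
    for label, wsi_index in enumerate(chosen):
        feats = np.asarray(features_by_wsi[wsi_ids[wsi_index]])
        # Draw patches_per_wsi distinct patches from this slide.
        idx = rng.choice(len(feats), size=patches_per_wsi, replace=False)
        sampled.append(feats[idx])
        labels.append(np.full(patches_per_wsi, label))

    x = np.concatenate(sampled)
    y = np.concatenate(labels)

    # The WSI labels are used only for coloring a plot; t-SNE never sees them.
    embedding = TSNE(n_components=2, init="pca", random_state=seed).fit_transform(x)
    return embedding, y
```

Plotting `embedding` as a scatter plot colored by `y` gives a picture of the same kind as Figure 2: well-separated single-color clusters would indicate WSI-specific feature collapse, while mixed colors would indicate features that are less tied to the source slide.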