Can We Simplify Slide-level Fine-tuning of Pathology Foundation Models?

Jiawen Li, Jiali Hu, Qiehe Sun, Renao Yan, Minxi Ouyang, Tian Guan, Anjia Han, Chao He, Yonghong He

TL;DR

This work questions the necessity of complex MIL-based fine-tuning for pathology foundation models and proposes SiMLP, a simple nonlinear head on mean-pooled patch features, as a task-agnostic slide-level adapter. Across seven large-scale datasets and multiple foundation models, SiMLP achieves state-of-the-art or competitive performance in slide-level classification, shows strong few-shot learning capability, and demonstrates robust transferability to external cohorts. The results suggest that task-agnostic slide representations can generalize better and offer more scalable deployment than traditional weakly supervised approaches, while also acknowledging that MIL methods may still be advantageous for certain specialized tasks. Overall, the study advocates simplifying fine-tuning pipelines to broaden applicability and efficiency in digital pathology while guiding future exploration of patch-to-slide representation learning and specialized weakly supervised strategies.

Abstract

The emergence of foundation models in computational pathology has transformed histopathological image analysis, with whole slide imaging (WSI) diagnosis being a core application. Traditionally, weakly supervised fine-tuning via multiple instance learning (MIL) has been the primary method for adapting foundation models to WSIs. However, in this work we present a key experimental finding: a simple nonlinear mapping strategy combining mean pooling and a multilayer perceptron, called SiMLP, can effectively adapt patch-level foundation models to slide-level tasks without complex MIL-based learning. Through extensive experiments across diverse downstream tasks, we demonstrate the superior performance of SiMLP compared with state-of-the-art methods. For instance, on a large-scale pan-cancer classification task, SiMLP surpasses popular MIL-based methods by 3.52%. Furthermore, SiMLP shows strong learning ability in few-shot classification while remaining highly competitive with slide-level foundation models pretrained on tens of thousands of slides. Finally, SiMLP exhibits remarkable robustness and transferability in lung cancer subtyping. Overall, our findings challenge the conventional MIL-based fine-tuning paradigm, demonstrating that a task-agnostic representation strategy alone can effectively adapt foundation models to WSI analysis. These insights offer a unique and meaningful perspective for future research in digital pathology, paving the way for more efficient and broadly applicable methodologies.
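To make the core idea concrete, the following is a minimal sketch of "mean pooling followed by a multilayer perceptron" as the abstract describes it. It is not the authors' implementation: the two-layer depth, the ReLU activation, the softmax output, the layer sizes, and every function name (`mean_pool`, `simlp_forward`, `init_layer`) are assumptions made for illustration, and the weights are random rather than trained. It is written in plain Python so it runs without dependencies; in practice a tensor library would be used.

```python
import math
import random


def mean_pool(patch_features):
    """Average N patch embeddings (each of length D) into one D-dim slide vector."""
    n = len(patch_features)
    return [sum(column) / n for column in zip(*patch_features)]


def linear(x, weights, bias):
    """Dense layer; weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Hypothetical random initialisation; a real head would be trained."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def simlp_forward(patch_features, layer1, layer2):
    """Mean-pool the patches, apply an assumed two-layer MLP, return class probabilities."""
    slide_vector = mean_pool(patch_features)
    hidden = relu(linear(slide_vector, *layer1))
    return softmax(linear(hidden, *layer2))


rng = random.Random(0)
EMBED_DIM, HIDDEN_DIM, NUM_CLASSES = 8, 4, 3  # illustrative sizes only
layer1 = init_layer(rng, EMBED_DIM, HIDDEN_DIM)
layer2 = init_layer(rng, HIDDEN_DIM, NUM_CLASSES)

# Two toy "slides" with different patch counts; pooling removes the dependence on N.
slide_a = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(5)]
slide_b = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(12)]
probs_a = simlp_forward(slide_a, layer1, layer2)
probs_b = simlp_forward(slide_b, layer1, layer2)
```

Because the pooling step is a plain average, the slide vector depends neither on the order of the patches nor on any task label, which is what makes the representation task-agnostic; only the small MLP head would be fitted per task.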

Paper Structure

This paper contains 14 sections, 2 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Transition of slide-level adaptation in pathology foundation models. a. Conventional fine-tuning strategy using task-specific supervised learning. b. Simplified fine-tuning strategy using task-agnostic pooling and a nonlinear classifier (SiMLP). c. Comparison of SiMLP and other MIL-based fine-tuning methods across three pathology foundation models.
  • Figure 2: Few-shot slide-level performance on the TCGA and CPTAC cohorts with $K\in\{1,5,10,20,50\}$ slides per class.
  • Figure 3: Robustness and transfer evaluation on the CPTAC, TCGA, and in-house NSCLC cohorts, sweeping 10 random seeds.
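
The few-shot and seed-sweep settings named in the captions for Figures 2 and 3 can be illustrated with a small sampling sketch. This is a sketch under stated assumptions, not the paper's protocol: the helper `sample_k_per_class`, the toy slide IDs, the class names `A` and `B`, and the choice to cap $K$ at the class size are all hypothetical.

```python
import random
from collections import defaultdict


def sample_k_per_class(slide_labels, k, seed):
    """Pick up to k slide IDs per class, reproducibly for a given seed."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    # Sort first so the result does not depend on dictionary insertion order.
    for slide_id, label in sorted(slide_labels.items()):
        by_class[label].append(slide_id)
    subset = {}
    for label, slide_ids in by_class.items():
        chosen = rng.sample(slide_ids, min(k, len(slide_ids)))
        subset[label] = sorted(chosen)
    return subset


# Toy cohort: 60 hypothetical slide IDs split evenly between two made-up classes.
toy_labels = {f"slide_{i:03d}": ("A" if i % 2 == 0 else "B") for i in range(60)}

# Sweep the K values from the Figure 2 caption and 10 seeds as in Figure 3.
subset_sizes = {}
for k in (1, 5, 10, 20, 50):
    for seed in range(10):
        subset = sample_k_per_class(toy_labels, k, seed)
        subset_sizes[(k, seed)] = {label: len(ids) for label, ids in subset.items()}
```

Each `(k, seed)` pair yields one training subset; a classifier head would be fitted on each subset and the scores aggregated across seeds. In this toy cohort each class has only 30 slides, so the $K=50$ setting is capped at 30.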