Domain Adaptation Without the Compute Burden for Efficient Whole Slide Image Analysis

Umar Marikkar, Muhammad Awais, Sara Atito

Abstract

Computational methods for analyzing Whole Slide Images (WSIs) enable early diagnosis and treatment by supporting pathologists in the detection and classification of tumors. However, the extremely high resolution of WSIs makes end-to-end training impractical compared to typical image analysis tasks. To address this, most approaches use pre-trained feature extractors to obtain fixed representations of whole slides, which are then combined with Multiple Instance Learning (MIL) for downstream tasks. These feature extractors are typically pre-trained on natural image datasets such as ImageNet and therefore fail to capture domain-specific characteristics. Although domain-specific pre-training on histopathology data yields more relevant feature representations, it remains computationally expensive and fails to capture task-specific characteristics within the domain. To address the computational cost and lack of task-specificity in domain-specific pre-training, we propose EfficientWSI (eWSI), a careful integration of Parameter-Efficient Fine-Tuning (PEFT) and MIL that enables end-to-end training on WSI tasks. We evaluate eWSI on seven WSI-level tasks across the Camelyon16, TCGA, and BRACS datasets. Our results show that eWSI, when applied with ImageNet feature extractors, yields strong classification performance, matching or outperforming MILs with in-domain feature extractors and alleviating the need for extensive in-domain pre-training. Furthermore, when eWSI is applied with in-domain feature extractors, it further improves classification performance in most cases, demonstrating its ability to capture task-specific information where beneficial. Our findings suggest that eWSI provides a task-targeted, computationally efficient path for WSI tasks, offering a promising direction for task-specific learning in computational pathology.

Paper Structure (35 sections, 6 equations, 11 figures, 7 tables)

Figures (11)

  • Figure 1: Moving from ImageNet to domain-specific encoders: an easy path through eWSI (radial axis is %AUC).
  • Figure 2: Robustness of MILs to patch sampling for each initialization under frozen encoder settings. $\bigstar$ denotes the PEFT+Lin-Max$_3$ performance under that setting. Experiments are conducted on Camelyon16, which is particularly sensitive to sampling rate due to smaller regions of interest.
  • Figure 3: Camelyon16 performance vs. LoRA rank $r$ (seed=0).
  • Figure 4: An outline of eWSI. The input patches are randomly sampled and fed into the $\mathtt{LoRA}$ encoder. The encoded patches are then passed through the LinMax aggregator to yield the global WSI representation $\mathbf{z}$.
  • Figure 5: Patch predictions versus tumor locations for the four observed false negatives using eWSI iNet-SSL and $M=64$. Yellow dots in the heatmaps indicate locations with high probability. The slide IDs are top left: $\mathtt{test\_038}$, top right: $\mathtt{test\_011}$, bottom left: $\mathtt{test\_013}$, and bottom right: $\mathtt{test\_099}$.
  • ...and 6 more figures
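
The pipeline outlined in the Figure 4 caption (random patch sampling, a LoRA encoder, then aggregation into a global representation $\mathbf{z}$) can be illustrated with a minimal, dependency-free sketch. Everything named here is hypothetical and chosen for illustration: `LoRALinear` stands in for a full encoder using a single linear layer with the standard low-rank update, `m` is a sample-size parameter, and `max_aggregate` is a plain element-wise max used as a stand-in for the LinMax aggregator, whose exact definition is not given in this excerpt. It is not the authors' implementation.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class LoRALinear:
    """Toy linear layer: frozen weight W plus a low-rank update (alpha / r) * B @ A.

    Assumption: one linear layer stands in for the whole encoder.
    """

    def __init__(self, d_in, d_out, r, alpha=1.0, rng=None):
        if rng is None:
            rng = random.Random(0)
        # Frozen "pre-trained" weight (random here, for illustration only).
        self.W = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable low-rank factors: A is small random, B starts at zero,
        # so the layer initially behaves exactly like the frozen one.
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def __call__(self, x):
        base = matvec(self.W, x)
        low_rank = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * l for b, l in zip(base, low_rank)]


def sample_patches(patches, m, rng):
    """Randomly sample at most m patches without replacement."""
    return rng.sample(patches, min(m, len(patches)))


def max_aggregate(features):
    """Element-wise max over patch features (stand-in for the aggregator)."""
    return [max(column) for column in zip(*features)]


def encode_slide(patches, encoder, m, rng):
    """Sample patches, encode each one, and pool into a slide-level vector z."""
    sampled = sample_patches(patches, m, rng)
    return max_aggregate([encoder(p) for p in sampled])


# Toy usage: 100 "patches" of 8 features each, pooled into a 4-dim vector.
rng = random.Random(0)
patches = [[rng.random() for _ in range(8)] for _ in range(100)]
encoder = LoRALinear(d_in=8, d_out=4, r=2, rng=rng)
z = encode_slide(patches, encoder, m=16, rng=rng)
```

In this sketch only `A` and `B` would receive gradient updates during fine-tuning, which is what keeps the number of trainable parameters small relative to the frozen weight `W`.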