Promptable Representation Distribution Learning and Data Augmentation for Gigapixel Histopathology WSI Analysis

Kunming Tang, Zhiguo Jiang, Jun Shi, Wei Wang, Haibo Wu, Yushan Zheng

TL;DR

The paper tackles data augmentation for gigapixel WSI classification under MIL, where patch representations are extracted once and kept fixed for efficiency, which makes augmentation difficult. It introduces Promptable Representation Distribution Learning (PRDL), which learns a patch-level representation distribution and uses augmentation prompts to guide feature-space augmentation, integrated with a DINO-style SSL backbone. A promptable representation sampling (PRS) module samples from these distributions online during WSI training, providing controllable, efficient augmentation without extra model parameters for inference. Empirical results on three lung-related datasets show that PRDL with PRS consistently outperforms state-of-the-art representation learning baselines and other WSI augmentation methods, improving the robustness and performance of MIL-based WSI classification. The work offers a scalable, controllable way to diversify patch representations for WSI analysis in pathology imaging.

Abstract

Gigapixel image analysis, particularly for whole slide images (WSIs), often relies on multiple instance learning (MIL). Under the MIL paradigm, patch image representations are extracted and then kept fixed during the training of the MIL classifiers for efficiency. However, this invariance of the representations makes it difficult to perform data augmentation for WSI-level model training, which significantly limits the performance of downstream WSI analysis. Current data augmentation methods for gigapixel images either introduce additional computational costs or lose semantic information, so they struggle to meet the efficiency and stability requirements of WSI model training. In this paper, we propose a Promptable Representation Distribution Learning framework (PRDL) for both patch-level representation learning and WSI-level data augmentation. We also explore the use of prompts to guide data augmentation in feature space, which enables promptable data augmentation for training robust WSI-level models. The experimental results demonstrate that the proposed method consistently outperforms state-of-the-art methods.
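
As a rough illustration of what feature-space augmentation by sampling can look like, the sketch below draws augmented patch features from a per-patch distribution while a prompt-dependent mask decides which feature dimensions are perturbed. This is not the paper's implementation: the Gaussian form, the function name `sample_patch_features`, the binary mask, and all shapes and values are assumptions made for this example. Only the general idea (sampling patch representations from a learned distribution under the guidance of an augmentation prompt, with a mask M per prompt) comes from the text.

```python
import numpy as np


def sample_patch_features(mu, sigma, prompt_mask, rng):
    """Draw one augmented feature vector per patch.

    Assumed form (hypothetical, for illustration only):
        z = mu + prompt_mask * sigma * eps,   eps ~ N(0, I)
    mu, sigma:    arrays of shape (num_patches, dim)
    prompt_mask:  array of shape (dim,), selects the dimensions
                  that the chosen augmentation prompt may perturb
    """
    eps = rng.standard_normal(mu.shape)
    return mu + prompt_mask * sigma * eps


rng = np.random.default_rng(0)
num_patches, dim = 4, 8

# Stand-ins for the per-patch distribution parameters that an
# encoder would produce once, before MIL training starts.
mu = rng.standard_normal((num_patches, dim))
sigma = np.full((num_patches, dim), 0.1)

# Hypothetical prompt that only affects the first three dimensions.
mask = np.zeros(dim)
mask[:3] = 1.0

# A fresh sample can be drawn at every training step, so the MIL
# classifier sees varied features without re-encoding any patch.
augmented = sample_patch_features(mu, sigma, mask, rng)
```

Because only cheap array operations run per step, sampling of this kind avoids re-running the patch encoder, which is the efficiency argument the abstract makes against image-space augmentation.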

Paper Structure

This paper contains 34 sections, 14 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Comparison between existing methods for WSI data augmentation and our method. (a) represents the traditional image augmentation used in natural images, which is inefficient. (b) involves the use of generative models for data augmentation in feature space. (c) describes our promptable representation sampling strategy tailored for WSI augmentation.
  • Figure 2: The proposed representation learning and WSI data augmentation framework. (a) shows the process of PRDL, where the two student branches share weights in the encoder and head; (b) and (c) provide detailed descriptions of the modules in (a); (d) shows the flowchart of WSI augmentation during training.
  • Figure 3: Comparisons with SOTA Methods. Please refer to the supplemental material in the extended version for complete numerical results.
  • Figure 4: Dimensional impact of different augmentation prompts on the USTC-EGFR dataset. (a) shows the image augmentations corresponding to the prompts; (b) shows the cosine similarities of the augmentation masks M.
  • Figure 5: Effects of the hyper-parameters on the USTC-EGFR validation subset under the CLAM benchmark.