MAPLE: Multi-scale Attribute-enhanced Prompt Learning for Few-shot Whole Slide Image Classification

Junjie Zhou; Wei Shao; Yagao Yue; Wei Mu; Peng Wan; Qi Zhu; Daoqiang Zhang

TL;DR

MAPLE tackles few-shot WSI classification by marrying MIL with vision-language prompting through a hierarchical, multi-scale approach. It uses LLM-generated entity- and slide-level prompts, language-guided instance selection, and a cross-scale graph to fuse fine-grained histology with global slide context, producing entity- and slide-level predictions that are then combined. Ablation and visualization analyses validate the effectiveness and interpretability of the entity-level prompts and cross-scale reasoning, while experiments on TCGA cohorts demonstrate robust improvements over state-of-the-art MIL and prompt-based methods. By aligning with pathologists' diagnostic workflows and reducing annotation burden, MAPLE offers a practical, interpretable solution for pathology AI in the few-shot regime.

Abstract

Prompt learning has emerged as a promising paradigm for adapting pre-trained vision-language models (VLMs) to few-shot whole slide image (WSI) classification by aligning visual features with textual representations, thereby reducing annotation cost and enhancing model generalization. Nevertheless, existing methods typically rely on slide-level prompts and fail to capture the subtype-specific phenotypic variations of histological entities (e.g., nuclei, glands) that are critical for cancer diagnosis. To address this gap, we propose Multi-scale Attribute-enhanced Prompt Learning (MAPLE), a hierarchical framework for few-shot WSI classification that jointly integrates multi-scale visual semantics and performs prediction at both the entity and slide levels. Specifically, we first leverage large language models (LLMs) to generate entity-level prompts that help identify multi-scale histological entities and their phenotypic attributes, as well as slide-level prompts that capture global visual descriptions. An entity-guided cross-attention module then generates entity-level features, which are aligned with their corresponding subtype-specific attributes for fine-grained entity-level prediction. To enrich entity representations, we further develop a cross-scale entity graph learning module that updates these representations by capturing their semantic correlations within and across scales. The refined representations are then aggregated into a slide-level representation and aligned with the corresponding prompts for slide-level prediction. Finally, we combine the entity-level and slide-level outputs to produce the final prediction. Results on three cancer cohorts confirm the effectiveness of our approach in addressing few-shot pathology diagnosis tasks.
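The two-level prediction flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, the temperature of 0.07, the mixing weight `alpha`, and the use of plain averaging are assumptions made for illustration. In particular, the cross-scale entity graph module is replaced here by a simple mean over entity features, and the text embeddings are assumed to be given as plain vectors rather than produced by a VLM text encoder.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # L2-normalize; fall back to 1.0 to avoid dividing by zero.
    n = math.sqrt(dot(v, v)) or 1.0
    return [a / n for a in v]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(query, patches):
    """Pool patch features into one entity feature, using a text
    embedding of the entity as the attention query (assumed form)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, p) / scale for p in patches])
    dim = len(patches[0])
    return [sum(w * p[d] for w, p in zip(weights, patches)) for d in range(dim)]

def class_probs(feature, class_prompts, temperature=0.07):
    """Softmax over cosine similarities between one visual feature
    and one prompt embedding per class."""
    f = normalize(feature)
    return softmax([dot(f, normalize(t)) / temperature for t in class_prompts])

def predict(patches, entity_queries, entity_class_prompts,
            slide_class_prompts, alpha=0.5):
    # Entity level: one pooled feature per entity, scored against that
    # entity's subtype-specific attribute prompts.
    entity_feats = [cross_attention(q, patches) for q in entity_queries]
    per_entity = [class_probs(f, prompts)
                  for f, prompts in zip(entity_feats, entity_class_prompts)]
    n_classes = len(slide_class_prompts)
    entity_pred = [sum(p[c] for p in per_entity) / len(per_entity)
                   for c in range(n_classes)]

    # Slide level: mean of entity features, a stand-in (assumption) for
    # the graph-refined aggregation described in the abstract.
    dim = len(entity_feats[0])
    slide_feat = [sum(f[d] for f in entity_feats) / len(entity_feats)
                  for d in range(dim)]
    slide_pred = class_probs(slide_feat, slide_class_prompts)

    # Final prediction: convex combination of the two levels (assumed).
    return [alpha * e + (1 - alpha) * s
            for e, s in zip(entity_pred, slide_pred)]
```

Calling `predict` with a list of patch feature vectors, one query embedding per entity, one list of per-class attribute embeddings per entity, and one slide-level embedding per class returns a probability vector over classes. Because both levels produce normalized distributions, the combined output also sums to one for any `alpha` in [0, 1].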
