Prompting Whole Slide Image Based Genetic Biomarker Prediction
Ling Zhang, Boxiang Yun, Xingran Xie, Qingli Li, Xinxing Li, Yan Wang
TL;DR
The work tackles predicting genetic biomarkers (e.g., microsatellite instability (MSI) and BRAF mutations) from gigapixel whole-slide images (WSIs) of colorectal cancer (CRC). It proposes PromptBio, a three-module framework that uses large language model (LLM) prompts to guide coarse-grained foreground selection, fine-grained grouping of instances into four pathology components, and mining of the interactions among those components with a Transformer-based model. By leveraging LLM-generated pathology prompts and a coarse-to-fine strategy focused on cancer-associated stroma, PromptBio achieves strong, interpretable MSI prediction, with MSI AUCs of 0.9149 on TCGA and 0.9125 on CPTAC, outperforming state-of-the-art multiple instance learning (MIL) baselines. The approach, validated on two CRC cohorts and accompanied by publicly available code, suggests a practical path toward interpretable, prompt-guided WSI-based biomarker prediction in clinical settings.
Abstract
Prediction of genetic biomarkers, e.g., microsatellite instability (MSI) and BRAF in colorectal cancer, is crucial for clinical decision making. In this paper, we propose a whole slide image (WSI) based genetic biomarker prediction method via prompting techniques. Our work aims at addressing the following challenges: (1) extracting foreground instances related to genetic biomarkers from gigapixel WSIs, and (2) modeling the interactions among the fine-grained pathological components in WSIs. Specifically, we leverage large language models to generate medical prompts that serve as prior knowledge in extracting instances associated with genetic biomarkers. We adopt a coarse-to-fine approach to mine biomarker information within the tumor microenvironment. This involves extracting instances related to genetic biomarkers using coarse medical prior knowledge, grouping pathology instances into fine-grained pathological components, and mining their interactions. Experimental results on two colorectal cancer datasets show the superiority of our method, achieving 91.49% in AUC for MSI classification. The analysis further shows the clinical interpretability of our method. Code is publicly available at https://github.com/DeepMed-Lab-ECNU/PromptBio.
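
To make the coarse-to-fine idea concrete, the following is a minimal sketch of the three steps (foreground selection, component grouping, interaction mining). It is not the authors' implementation, which is available in the linked repository. It assumes that patch features and prompt features already live in a shared embedding space, and it uses random vectors in place of real embeddings. All function names, the `keep_ratio` parameter, and the parameter-free single-head attention (a stand-in for a Transformer layer) are hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def select_foreground(patch_emb, coarse_prompt_emb, keep_ratio=0.5):
    """Coarse step (assumed form): keep the patches most similar to one coarse prompt."""
    sims = l2_normalize(patch_emb) @ l2_normalize(coarse_prompt_emb)
    k = max(1, int(round(keep_ratio * len(patch_emb))))
    keep = np.argsort(-sims)[:k]  # indices of the k highest similarities
    return np.sort(keep)


def group_components(patch_emb, component_prompt_emb):
    """Fine step (assumed form): assign each patch to its most similar component prompt."""
    sims = l2_normalize(patch_emb) @ l2_normalize(component_prompt_emb).T
    return sims.argmax(axis=1)


def component_tokens(patch_emb, labels, n_components):
    """Average the patches of each component into one token (zeros if a group is empty)."""
    tokens = np.zeros((n_components, patch_emb.shape[1]))
    for c in range(n_components):
        mask = labels == c
        if mask.any():
            tokens[c] = patch_emb[mask].mean(axis=0)
    return tokens


def self_attention(tokens):
    """Single-head attention without learned weights, standing in for a Transformer layer."""
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ tokens, weights


# Toy data: 200 patches, 64-dim features, 1 coarse prompt, 4 component prompts.
rng = np.random.default_rng(0)
patches = rng.normal(size=(200, 64))
coarse_prompt = rng.normal(size=64)
component_prompts = rng.normal(size=(4, 64))

keep = select_foreground(patches, coarse_prompt, keep_ratio=0.5)
labels = group_components(patches[keep], component_prompts)
tokens = component_tokens(patches[keep], labels, n_components=4)
mixed, attn = self_attention(tokens)
```

In this sketch, the attention matrix `attn` has one row per component, so it gives a rough view of how strongly each component attends to the others. A real pipeline would replace the random arrays with embeddings from a pathology-capable vision-language encoder and would train a classifier on top of the mixed tokens.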
