HistoMet: A Pan-Cancer Deep Learning Framework for Prognostic Prediction of Metastatic Progression and Site Tropism from Primary Tumor Histopathology

Yixin Chen, Ziyu Su, Lingbin Meng, Elshad Hasanov, Wei Chen, Anil Parwani, M. Khalid Khan Niazi

TL;DR

HistoMet predicts metastatic progression and site tropism from primary tumor histology with a two-stage, decision-aware MIL framework that gates downstream site prediction on high-risk cases. It integrates vision-language-based semantic concepts with data-driven visual prototypes, using multi-scale (10x and 20x) features and a cross-attention prototype condensation mechanism to produce interpretable slide-level predictions. The approach achieves state-of-the-art performance on a large pan-cancer dataset, reduces downstream workload, and provides qualitative interpretability through prototype-level visual-semantic mappings, while highlighting practical considerations such as potential LLM prompt hallucinations and frozen encoders. These advances suggest a path toward deployment-ready, decision-aware prognostic tools in clinical oncology that can generalize across cancer types and metastatic sites.
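As a rough illustration of what a cross-attention prototype condensation step could look like, the sketch below lets a small set of prototype vectors act as queries over the patch features of one slide, so that each prototype ends up with an attention-weighted summary of the patches that resemble it. This is generic scaled dot-product cross-attention in plain Python, not HistoMet's implementation: the function name `condense`, the toy vectors, and the omission of learned projection matrices and multi-scale inputs are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def condense(prototypes, patches):
    """Generic cross-attention: prototypes are queries, patches are keys and values.

    prototypes: K vectors of dimension d (hypothetical concept/prototype embeddings)
    patches:    N vectors of dimension d (hypothetical patch embeddings of one slide)
    Returns (K condensed vectors, K x N attention weights).
    """
    d = len(prototypes[0])
    scale = 1.0 / math.sqrt(d)
    condensed, weights = [], []
    for q in prototypes:
        # Scaled dot-product similarity between this prototype and every patch.
        scores = [scale * sum(qi * pi for qi, pi in zip(q, p)) for p in patches]
        attn = softmax(scores)
        weights.append(attn)
        # Attention-weighted average of the patch features.
        condensed.append(
            [sum(a * p[j] for a, p in zip(attn, patches)) for j in range(d)]
        )
    return condensed, weights


# Toy example: two prototypes, three patches, feature dimension 2.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
patches = [[4.0, 0.0], [0.0, 4.0], [0.1, 0.1]]
condensed, weights = condense(prototypes, patches)
```

Each row of `weights` sums to 1 and shows which patches drive a given prototype, which is the kind of prototype-to-patch mapping that can be inspected for interpretability.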

Abstract

Metastatic progression is the leading cause of cancer-related mortality, yet predicting whether a primary tumor will metastasize and where it will disseminate directly from histopathology remains a fundamental challenge. Although whole-slide images (WSIs) provide rich morphological information, prior computational pathology approaches typically address metastatic status or site prediction as isolated tasks and do not explicitly model the clinically sequential decision process of metastatic risk assessment followed by downstream site-specific evaluation. To address this research gap, we present a decision-aware, concept-aligned MIL framework, HistoMet, for prognostic metastatic outcome prediction from primary tumor WSIs. Our proposed framework adopts a two-module prediction pipeline in which the likelihood of metastatic progression from the primary tumor is first estimated, followed by conditional prediction of metastatic site for high-risk cases. To guide representation learning and improve clinical interpretability, our framework integrates linguistically defined and data-adaptive metastatic concepts through a pretrained pathology vision-language model. We evaluate HistoMet on a multi-institutional pan-cancer cohort of 6,504 patients with metastasis follow-up and site annotations. Under clinically relevant high-sensitivity screening settings (95% sensitivity), HistoMet significantly reduces downstream workload while maintaining high metastatic risk recall. Conditional on metastatic cases, HistoMet achieves a macro F1 of 74.6 with a standard deviation of 1.3 and a macro one-vs-rest AUC of 92.1. These results demonstrate that explicitly modeling clinical decision structure enables robust and deployable prognostic prediction of metastatic progression and site tropism directly from primary tumor histopathology.
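The two-module decision flow described above can be sketched as follows: pick a risk threshold that reaches a target sensitivity on held-out cases, then run site prediction only for cases at or above that threshold. This is a minimal sketch under stated assumptions, not the authors' code: the helper names, the toy scores, and the `LOW_RISK_LABEL` string are hypothetical, and the four site names are taken from the Figure 3 caption.

```python
import math

SITES = ["lymph node", "soft tissue", "brain", "liver"]
LOW_RISK_LABEL = "no metastasis predicted"  # hypothetical label for gated-out cases


def threshold_for_sensitivity(scores, labels, target=0.95):
    """Largest threshold t such that flagging score >= t recalls >= target of positives."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not pos:
        raise ValueError("need at least one positive case")
    # Number of positives that must be flagged; the epsilon guards float round-off.
    k = max(1, math.ceil(target * len(pos) - 1e-9))
    return pos[k - 1]


def two_stage_predict(risk_score, site_probs, threshold, site_names=SITES):
    """Module A gates on risk; Module B picks the most likely site for high-risk cases."""
    if risk_score < threshold:
        return LOW_RISK_LABEL
    best = max(range(len(site_probs)), key=site_probs.__getitem__)
    return site_names[best]


# Toy validation set: risk scores with binary metastasis labels (illustrative values).
scores = [0.9, 0.8, 0.75, 0.6, 0.55, 0.4, 0.35, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

t95 = threshold_for_sensitivity(scores, labels, target=0.95)
# Fraction of cases forwarded to site prediction at this operating point.
workload = sum(s >= t95 for s in scores) / len(scores)
```

Raising the target sensitivity lowers the threshold, so more cases are forwarded to the second module; this is the trade-off between recall and downstream workload that the abstract refers to.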

Paper Structure (23 sections, 22 equations, 11 figures, 1 algorithm)

Figures (11)

  • Figure 1: Overview of the proposed decision-aware, concept-aligned prognostic framework.
  • Figure 2: End-to-end performance evaluation of the HistoMet two-module framework for metastatic progression prediction. A. End-to-end accuracy comparison across four models (HistoMet, ABMIL, CLAM, TransMIL) at two operating points (Target Sensitivity = 0.95 and 0.90). B. Dumbbell plot showing E2E F1 score changes when relaxing target sensitivity from 0.95 to 0.90. C. Workload comparison indicating the proportion of cases requiring expert review. D. Conditional site prediction metrics for Module B, evaluated only on cases correctly identified as metastatic by Module A. E. Bump chart illustrating model ranking changes across operating points. F. Radar plot summarizing multi-metric profiles across E2E and conditional metrics. G. End-to-end F1 score comparison against baseline methods. H. Performance summary heatmap with best performers highlighted. I. Forest plot with 95% confidence intervals for E2E accuracy at each sensitivity threshold. J. Statistical comparison of F1 scores with significance annotations. K. Forest plot with 95% confidence intervals for E2E F1 at each sensitivity threshold. L. Statistical comparison of accuracy scores with significance annotations.
  • Figure 3: Comprehensive evaluation of Module A binary screening performance. The Sankey diagram illustrates patient-level case flow from initial binary screening (Module A) to final 5-class prediction across the full test cohort (n = 6,857). Module A filters 23.0% of cases as predicted primary tumors, while 77.0% are forwarded to Module B for metastatic site prediction. Among filtered cases, 1,409 are correctly identified as primary tumors (true negatives), whereas 168 metastatic cases are missed (false negatives). Cases forwarded to Module B include 2,836 true metastatic cases (true positives) and 2,444 false-positive primary cases, representing unnecessary workload. Module B performs 4-class metastatic site prediction (lymph node, soft tissue, brain, liver). Overall, the system achieves 55.75% end-to-end 5-class accuracy with a 23.0% workload reduction compared to processing all cases. Flow width is proportional to case count; green and red flows denote correct and incorrect decisions, respectively.
  • Figure 4: Comprehensive evaluation of Module A binary screening performance.
  • Figure 5: Comprehensive evaluation of Module B site prediction performance.
  • ...and 6 more figures
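
The case counts quoted in the Figure 3 caption can be cross-checked with a few lines of arithmetic. The sketch below recomputes the cohort size, the filtered and forwarded shares, and the Module A sensitivity and specificity implied by those counts. The variable names are illustrative, and the sensitivity and specificity values are derived here from the caption's counts rather than quoted from the paper.

```python
# Counts as stated in the Figure 3 caption.
true_negatives = 1409   # filtered, correctly identified as primary tumors
false_negatives = 168   # filtered, but actually metastatic
true_positives = 2836   # forwarded to Module B, actually metastatic
false_positives = 2444  # forwarded to Module B, actually primary

total = true_negatives + false_negatives + true_positives + false_positives
filtered = true_negatives + false_negatives
forwarded = true_positives + false_positives

# Share of cases that never reach Module B (the reported workload reduction).
workload_reduction = filtered / total
forwarded_share = forwarded / total

# Module A screening metrics implied by the counts (derived, not quoted).
sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)

print(f"total cases:        {total}")
print(f"workload reduction: {workload_reduction:.1%}")
print(f"forwarded share:    {forwarded_share:.1%}")
print(f"sensitivity:        {sensitivity:.3f}")
print(f"specificity:        {specificity:.3f}")
```

The four counts sum to the stated cohort of 6,857 and reproduce the 23.0% and 77.0% split, so the caption's figures are internally consistent.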