Deep Learning-based Prediction of Breast Cancer Tumor and Immune Phenotypes from Histopathology
Tiago Gonçalves, Dagoberto Pulido-Arias, Julian Willett, Katharina V. Hoebel, Mason Cleveland, Syed Rakin Ahmed, Elizabeth Gerstner, Jayashree Kalpathy-Cramer, Jaime S. Cardoso, Christopher P. Bridge, Albert E. Kim
TL;DR
The paper tackles the problem of deriving reproducible tumor microenvironment phenotypes for individual breast cancer patients by predicting pathway activity from hematoxylin and eosin slides using MIL-based deep learning. It compares CLAM and TransMIL architectures with two feature extractors (ResNet50/ImageNet and PLIP/OpenPath) and uses ssGSEA-derived binary pathway labels to train models. The results show AUROCs above 0.70 for most pathways and 0.75–0.80 for several immune-related programs, with PLIP features and attention maps supporting biologically meaningful learning. This work demonstrates the feasibility of computational H&E biomarkers for precision oncology and provides a publicly available pipeline with directions for multi-modal extensions.
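As an illustration of the evaluation described above, the following minimal sketch turns continuous enrichment scores into binary labels and computes an AUROC by its pairwise-ranking definition. The median cutoff, the helper names (`binarize_by_median`, `auroc`), and the toy numbers are assumptions made for illustration only; the summary does not state how the ssGSEA scores were thresholded, and the numbers are not results from the paper.

```python
import numpy as np

def binarize_by_median(scores):
    """Label a sample 1 if its score is above the cohort median (assumed cutoff)."""
    scores = np.asarray(scores, dtype=float)
    return (scores > np.median(scores)).astype(int)

def auroc(labels, predictions):
    """AUROC as the probability that a positive outranks a negative (ties count half)."""
    labels = np.asarray(labels)
    predictions = np.asarray(predictions, dtype=float)
    pos = predictions[labels == 1]
    neg = predictions[labels == 0]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    # Compare every positive prediction against every negative prediction.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy values, invented for illustration; not data from the paper.
toy_enrichment = [0.10, 0.40, 0.35, 0.80, 0.65, 0.20]
toy_predictions = [0.20, 0.70, 0.30, 0.90, 0.40, 0.60]

toy_labels = binarize_by_median(toy_enrichment)
toy_auroc = auroc(toy_labels, toy_predictions)
print(toy_labels, round(float(toy_auroc), 3))
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a rank-based implementation would be preferable for large cohorts.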
Abstract
The interactions between tumor cells and the tumor microenvironment (TME) dictate the therapeutic efficacy of radiation and many systemic therapies in breast cancer. However, to date, no widely available method exists to reproducibly measure tumor and immune phenotypes for each patient's tumor. Given this unmet clinical need, we applied multiple instance learning (MIL) algorithms to assess the activity of ten biologically relevant pathways from the hematoxylin and eosin (H&E) slide of primary breast tumors. We employed different feature extraction approaches and state-of-the-art model architectures. Using binary classification, our models attained area under the receiver operating characteristic curve (AUROC) scores above 0.70 for nearly all gene expression pathways and, in some cases, exceeded 0.80. Attention maps suggest that our trained models recognize biologically relevant spatial patterns of cell sub-populations from H&E. These efforts represent a first step towards developing computational H&E biomarkers that reflect facets of the TME and hold promise for augmenting precision oncology.
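To make the MIL setup concrete, the sketch below shows generic attention-based pooling over the patch embeddings of one slide: each patch receives a weight, the weighted average forms a slide-level embedding, and a linear classifier yields a pathway probability. This is a simplified stand-in and not the CLAM or TransMIL architecture; the function name, the dimensions, and the random parameters are all hypothetical. The per-patch weights it returns are the kind of quantity an attention map visualizes.

```python
import numpy as np

def attention_mil_forward(patch_features, V, w, classifier_w, classifier_b):
    """Attention pooling over one slide (bag) of patch embeddings.

    patch_features: (n_patches, d) embeddings from a frozen feature extractor
    V: (d, h) and w: (h,) parameterize the attention scorer
    classifier_w: (d,) and classifier_b: scalar form the slide-level classifier
    """
    hidden = np.tanh(patch_features @ V)            # (n_patches, h)
    scores = hidden @ w                             # one raw score per patch
    scores = scores - scores.max()                  # stabilize the softmax
    attention = np.exp(scores) / np.exp(scores).sum()
    slide_embedding = attention @ patch_features    # weighted average, shape (d,)
    logit = slide_embedding @ classifier_w + classifier_b
    probability = 1.0 / (1.0 + np.exp(-logit))      # sigmoid for a binary label
    return probability, attention

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_patches, d, h = 500, 512, 64
features = rng.normal(size=(n_patches, d))
V = rng.normal(scale=0.05, size=(d, h))
w = rng.normal(scale=0.05, size=h)
clf_w = rng.normal(scale=0.05, size=d)

prob, attn = attention_mil_forward(features, V, w, clf_w, 0.0)
top_patches = np.argsort(attn)[::-1][:5]            # most attended patch indices
```

In a trained model the parameters would be learned from slide-level labels alone, which is why no patch-level annotation is needed in this setting.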
