BioFusionNet: Deep Learning-Based Survival Risk Stratification in ER+ Breast Cancer Through Multifeature and Multimodal Data Fusion
Raktim Kumar Mondol, Ewan K. A. Millar, Arcot Sowmya, Erik Meijering
TL;DR
ER+ breast cancer prognosis remains challenging due to heterogeneity across imaging, genomics, and clinical data. BioFusionNet tackles this with a multimodal pipeline that fuses self-supervised histopathology features from DINO and MoCoV3 with gene expression and clinical variables via a variational autoencoder, co-dual-cross-attention, and late fusion, optimized by a weighted Cox loss. The model achieves a mean C-index of $0.77$ and a time-dependent AUC of $0.84$, with significant hazard stratification (OS HR ≈ $2.99$ univariate, $2.91$ multivariate) and interpretability through attention maps and SHAP analyses. This approach demonstrates the value of comprehensive data integration for accurate prognosis in ER+ breast cancer and offers a foundation for clinical translation, though it carries substantial computational cost and requires broader validation across datasets and cancer types.
Abstract
Breast cancer is a significant health concern affecting millions of women worldwide. Accurate survival risk stratification plays a crucial role in guiding personalised treatment decisions and improving patient outcomes. Here we present BioFusionNet, a deep learning framework that fuses image-derived features with genetic and clinical data to obtain a holistic profile and achieve survival risk stratification of ER+ breast cancer patients. We employ multiple self-supervised feature extractors (DINO and MoCoV3) pretrained on histopathological patches to capture detailed image features. These features are then fused by a variational autoencoder and fed to a self-attention network that generates patient-level features. A co-dual-cross-attention mechanism combines the histopathological features with genetic data, enabling the model to capture the interplay between them. Clinical data are incorporated through a feed-forward network, which further improves predictive performance and completes the multimodal feature integration. We also introduce a weighted Cox loss function designed to handle imbalanced survival data, a common challenge in this setting. Our model achieves a mean concordance index of 0.77 and a time-dependent area under the curve of 0.84, outperforming state-of-the-art methods. It predicts risk (high versus low) with prognostic significance for overall survival in univariate analysis (HR=2.99, 95% CI: 1.88–4.78, p<0.005), and maintains independent significance in multivariate analysis incorporating standard clinicopathological variables (HR=2.91, 95% CI: 1.80–4.68, p<0.005).
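To make the idea of a weighted Cox loss concrete, the following is a minimal NumPy sketch of a negative Cox partial log-likelihood with per-patient weights. It is an illustration under stated assumptions, not the loss used in BioFusionNet: the abstract does not specify the weighting scheme, so the function name `weighted_cox_loss`, the `weights` argument, and the normalisation by the summed event weights are all hypothetical choices made here.

```python
import numpy as np


def weighted_cox_loss(risk, time, event, weights=None):
    """Negative weighted Cox partial log-likelihood (illustrative sketch).

    risk    : (n,) predicted log-hazard scores, higher means higher risk
    time    : (n,) follow-up times
    event   : (n,) 1 if the event was observed, 0 if censored
    weights : (n,) per-patient weights (hypothetical; defaults to all ones)
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    if weights is None:
        weights = np.ones_like(risk)
    weights = np.asarray(weights, dtype=float)

    # at_risk[i, j] is True when patient j is still under observation at time[i].
    at_risk = time[None, :] >= time[:, None]

    # Numerically stable log of the summed hazards over each risk set.
    shift = risk.max()
    exp_risk = np.exp(risk - shift)
    log_denominator = shift + np.log((at_risk * exp_risk[None, :]).sum(axis=1))

    # Per-patient negative log partial likelihood; only observed events contribute.
    per_patient = log_denominator - risk
    event_weights = weights * event
    total = event_weights.sum()
    if total == 0:
        # No observed events: the partial likelihood carries no information.
        return 0.0
    return float((event_weights * per_patient).sum() / total)


if __name__ == "__main__":
    times = [5.0, 8.0, 12.0, 20.0]
    events = [1, 0, 1, 0]
    scores = [1.2, 0.3, 0.1, -0.8]
    # Assumed example weighting: up-weight the (rarer) observed events.
    w = [2.0, 1.0, 2.0, 1.0]
    print(weighted_cox_loss(scores, times, events, w))
```

In this sketch, censored patients still shape the loss through the risk sets even though their own terms are zeroed out. One plausible way to address imbalance, again an assumption, is to set `weights` inversely proportional to the frequency of each patient's event status.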
