Prototype-Guided Cross-Modal Knowledge Enhancement for Adaptive Survival Prediction
Fengchun Liu, Linghan Cai, Zhikang Wang, Zhiyuan Fan, Jin-gang Yu, Hao Chen, Yongbing Zhang
TL;DR
ProSurv tackles histo-genomic survival prediction in settings where paired multimodal data are unavailable. It introduces intra-modal prototype banks and a prototype-guided cross-modal translation module that transfer knowledge across modalities without requiring paired data. Through risk-contrastive learning and event-aware sampling, the framework preserves risk information specific to each modality and time interval, and it generates features for a missing modality to improve prediction. Evaluations on four TCGA datasets show state-of-the-art performance with multimodal input and strong performance with unimodal input, which suggests practical value for precision medicine in clinical workflows.
Abstract
Histo-genomic multimodal survival prediction has garnered growing attention for its remarkable model performance and potential contributions to precision medicine. However, a significant challenge in clinical practice arises when only unimodal data is available, limiting the usability of these advanced multimodal methods. To address this issue, this study proposes a prototype-guided cross-modal knowledge enhancement (ProSurv) framework, which eliminates the dependency on paired data and enables robust learning and adaptive survival prediction. Specifically, we first introduce an intra-modal updating mechanism to construct modality-specific prototype banks that encapsulate the statistics of the whole training set and preserve the modality-specific risk-relevant features/prototypes across intervals. Subsequently, the proposed cross-modal translation module utilizes the learned prototypes to enhance knowledge representation for multimodal inputs and generate features for missing modalities, ensuring robust and adaptive survival prediction across diverse scenarios. Extensive experiments on four public datasets demonstrate the superiority of ProSurv over state-of-the-art methods using either unimodal or multimodal input, and the ablation study underscores its feasibility for broad applicability. Overall, this study addresses a critical practical challenge in computational pathology, offering substantial significance and potential impact in the field.
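The two mechanisms named in the abstract, intra-modal prototype updating and prototype-guided cross-modal translation, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no update rule or translation architecture, so the exponential-moving-average update, the dot-product attention, the one-prototype-per-interval layout, and every name below (`PrototypeBank`, `translate`, `momentum`) are assumptions chosen only for illustration.

```python
import math


class PrototypeBank:
    """Hypothetical bank: one prototype vector per survival-time interval."""

    def __init__(self, num_intervals, dim, momentum=0.9):
        self.momentum = momentum
        self.prototypes = [[0.0] * dim for _ in range(num_intervals)]
        self.initialized = [False] * num_intervals

    def update(self, feature, interval):
        """Assumed rule: exponential moving average within a single modality."""
        if not self.initialized[interval]:
            # The first sample seen for an interval seeds its prototype.
            self.prototypes[interval] = list(feature)
            self.initialized[interval] = True
            return
        m = self.momentum
        old = self.prototypes[interval]
        self.prototypes[interval] = [m * p + (1.0 - m) * f for p, f in zip(old, feature)]


def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def translate(feature, source_bank, target_bank):
    """Assumed translation: attend over the source-modality prototypes, then
    mix the target-modality prototypes of the same intervals with those weights."""
    scores = [sum(f * p for f, p in zip(feature, proto)) for proto in source_bank.prototypes]
    weights = softmax(scores)
    dim = len(target_bank.prototypes[0])
    return [
        sum(w * proto[d] for w, proto in zip(weights, target_bank.prototypes))
        for d in range(dim)
    ]


# Each bank is updated from its own modality only, so no paired samples are needed.
histo_bank = PrototypeBank(num_intervals=2, dim=2)
geno_bank = PrototypeBank(num_intervals=2, dim=3)
histo_bank.update([1.0, 0.0], interval=0)
histo_bank.update([0.0, 1.0], interval=1)
geno_bank.update([1.0, 0.0, 0.0], interval=0)
geno_bank.update([0.0, 0.0, 1.0], interval=1)

# A histology-only case: build a surrogate genomic feature from the prototypes.
surrogate = translate([5.0, 0.0], histo_bank, geno_bank)
```

In this toy setup the histology feature lies closest to the interval-0 prototype, so the surrogate genomic feature is dominated by the interval-0 genomic prototype. A trained model would presumably learn the features and the translation jointly; the sketch fixes both by hand to keep the data flow visible.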
