PRISM: A Framework Harnessing Unsupervised Visual Representations and Textual Prompts for Explainable MACE Survival Prediction from Cardiac Cine MRI
Haoyang Su, Jin-Yi Xiang, Shaohao Rui, Yifan Gao, Xingyu Chen, Tingxuan Yin, Shaoting Zhang, Xiaosong Wang, Lian-Ming Wu
TL;DR
PRISM predicts major adverse cardiac events (MACE) by fusing unsupervised visual representations from non-contrast cine MRI with structured EHR data in a three-stage survival modeling framework: motion-aware multi-view distillation (Stage I), prompt-guided cross-modal EHR alignment (Stage II), and Cox proportional hazards (CoxPH) fusion for survival prediction (Stage III), with alignment losses ensuring semantic and structural coherence. The Stage III survival model is $h(t \mid \mathbf{x}_{\mathrm{fused}}) = h_0(t) \exp\left(\sum_k \beta_k x_{\mathrm{fused},k}\right)$, where $h_0(t)$ is the baseline hazard and $\beta_k$ is the coefficient of the $k$-th fused feature. The approach reveals three spatiotemporal imaging signatures (lateral wall dyssynchrony, inferior wall hypersensitivity, and anterior elevated focus during diastole) whose patterns map to coronary territories, and identifies hypertension, diabetes, and smoking as key EHR drivers via BiPromptSurv attribution. Across four independent cohorts under IECV, PRISM outperforms classical and SOTA baselines, demonstrating robust generalization and offering interpretable, annotation-free risk stratification for practical cardiovascular prognosis.
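To make the CoxPH form above concrete, the following is a minimal NumPy sketch of the relative hazard $\exp\left(\sum_k \beta_k x_k\right)$ and the standard Cox negative partial log-likelihood used to fit $\beta$. It is not PRISM's implementation: the function names, the toy inputs, and the no-tied-event-times simplification are assumptions made for illustration only.

```python
import numpy as np

def relative_hazard(x_fused, beta):
    """Return exp(sum_k beta_k * x_k), the hazard relative to the baseline h0(t)."""
    x = np.asarray(x_fused, dtype=float)
    return np.exp(x @ np.asarray(beta, dtype=float))

def neg_partial_log_likelihood(x_fused, beta, time, event):
    """Cox negative partial log-likelihood (illustrative; assumes no tied event times)."""
    x = np.asarray(x_fused, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    eta = x @ np.asarray(beta, dtype=float)      # linear predictor per subject
    order = np.argsort(-time)                    # descending follow-up time
    eta, event = eta[order], event[order]
    # After sorting, the risk set of subject i (everyone with time >= t_i)
    # is the prefix 0..i, so a running log-sum-exp gives its log total hazard.
    log_risk = np.logaddexp.accumulate(eta)
    return -np.sum((eta - log_risk)[event])      # censored subjects contribute no term

# Hypothetical toy data: 3 subjects, 2 fused features, all with observed events.
x_toy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
nll_at_zero = neg_partial_log_likelihood(x_toy, [0.0, 0.0], [5.0, 3.0, 1.0], [1, 1, 1])
```

Because $h_0(t)$ cancels in the partial likelihood, only the coefficients $\beta_k$ are estimated; with $\beta = 0$ every subject in a risk set is equally likely to fail next, so the toy value above equals $\log 3 + \log 2 + \log 1 = \log 6$.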
Abstract
Accurate prediction of major adverse cardiac events (MACE) remains a central challenge in cardiovascular prognosis. We present PRISM (Prompt-guided Representation Integration for Survival Modeling), a self-supervised framework that integrates visual representations from non-contrast cardiac cine magnetic resonance imaging with structured electronic health records (EHRs) for survival analysis. PRISM extracts temporally synchronized imaging features through motion-aware multi-view distillation and modulates them using medically informed textual prompts to enable fine-grained risk prediction. Across four independent clinical cohorts, PRISM consistently surpasses classical survival prediction models and state-of-the-art (SOTA) deep learning baselines under internal and external validation. Further clinical findings demonstrate that the combined imaging and EHR representations derived from PRISM provide valuable insights into cardiac risk across diverse cohorts. Three distinct imaging signatures associated with elevated MACE risk are uncovered, including lateral wall dyssynchrony, inferior wall hypersensitivity, and anterior elevated focus during diastole. Prompt-guided attribution further identifies hypertension, diabetes, and smoking as dominant contributors among clinical and physiological EHR factors.
