EsurvFusion: An evidential multimodal survival fusion model based on Gaussian random fuzzy numbers

Ling Huang, Yucheng Xing, Qika Lin, Su Ruan, Mengling Feng

TL;DR

EsurvFusion tackles multimodal survival analysis under censoring by modeling each modality with Gaussian random fuzzy numbers (GRFNs) to quantify both aleatoric and epistemic uncertainty, then learning modality reliability through a discounting mechanism and fusing predictions at the decision level with an evidential fusion layer. The approach introduces GRFNs, reliability discounting, and an evidence-based fusion strategy, optimized by a hybrid loss that balances unimodal and multimodal evidence via belief and plausibility. Empirical results on four cancer datasets show a higher C-index and improved calibration (lower IBS and IBLL) than state-of-the-art baselines, while providing interpretable modality contributions and uncertainty estimates. This work advances reliable and transparent multimodal survival analysis and suggests broader use of evidential uncertainty in clinical time-to-event prediction, with future work aimed at scaling to larger datasets and deeper architectures.
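
To make the C-index mentioned above concrete, here is a minimal sketch of Harrell's concordance index for right-censored data. It is a generic illustration, not the authors' evaluation code; the function name `concordance_index` and the toy inputs are assumptions, and pairs with tied survival times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (simplified sketch).

    times  : observed times (event or censoring time)
    events : 1 if the event was observed, 0 if the subject was censored
    risks  : predicted risk scores (higher risk should mean shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored subject cannot serve as the earlier member of a pair,
            # because its true event time is unknown.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (hypothetical values): the third subject is censored at t = 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)  # 4 of 5 pairs concordant
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of the comparable pairs.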

Abstract

Multimodal survival analysis aims to combine heterogeneous data sources (e.g., clinical, imaging, text, genomics) to improve the prediction quality of survival outcomes. However, this task is particularly challenging due to high heterogeneity and noise across data sources, which vary in structure, distribution, and context. Additionally, the ground truth is often censored (uncertain) due to incomplete follow-up data. In this paper, we propose a novel evidential multimodal survival fusion model, EsurvFusion, designed to combine multimodal data at the decision level through an evidence-based decision fusion layer that jointly addresses both data and model uncertainty while incorporating modality-level reliability. Specifically, EsurvFusion first models unimodal data with newly introduced Gaussian random fuzzy numbers, producing unimodal survival predictions along with the corresponding aleatoric and epistemic uncertainties. It then estimates modality-level reliability through a reliability discounting layer to correct the misleading impact of noisy data modalities. Finally, a multimodal evidence-based fusion layer combines the discounted predictions into a unified, interpretable multimodal survival analysis model, revealing each modality's influence through the learned reliability coefficients. This is the first work to study multimodal survival analysis with both uncertainty and reliability. Extensive experiments on four multimodal survival datasets demonstrate the effectiveness of our model in handling highly heterogeneous data, establishing a new state of the art on several benchmarks.
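
The discount-then-fuse idea described above can be illustrated on a toy problem. The sketch below deliberately swaps in classical Dempster-Shafer tools on a finite frame (Shafer discounting followed by Dempster's rule), not the GRFN-based discounting and fusion layers of EsurvFusion. The frame `{"short", "long"}`, the mass values, and the reliability coefficients are all hypothetical.

```python
from itertools import product


def discount(mass, reliability, frame):
    """Shafer discounting: keep a fraction `reliability` of every mass and
    transfer the remaining 1 - reliability to the whole frame (ignorance)."""
    out = {focal: reliability * m for focal, m in mass.items()}
    out[frame] = out.get(frame, 0.0) + (1.0 - reliability)
    return out


def dempster_combine(m1, m2):
    """Dempster's rule: intersect focal sets, multiply masses, and
    renormalize by the non-conflicting mass."""
    combined = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {focal: m / (1.0 - conflict) for focal, m in combined.items()}


# Hypothetical two-outcome frame and two modality-level mass functions.
FRAME = frozenset({"short", "long"})
SHORT = frozenset({"short"})
LONG = frozenset({"long"})

m_clinical = {SHORT: 0.8, FRAME: 0.2}  # clinical data favors short survival
m_radiomic = {LONG: 0.6, FRAME: 0.4}   # radiomic data favors long survival

# Assumed reliabilities: the clinical modality is trusted more.
fused = dempster_combine(
    discount(m_clinical, 0.9, FRAME),
    discount(m_radiomic, 0.5, FRAME),
)
```

Because the less reliable modality has most of its mass moved to the whole frame before combination, its disagreement has a limited effect on the fused result. In the paper the reliability coefficients are learned rather than fixed by hand.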

Paper Structure

This paper contains 30 sections, 11 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Overview of the evidential multimodal survival fusion (EsurvFusion) model. It is composed of several unimodal ENNreg modules that predict survival evidence at the modality level, along with evidence discounting layers that learn the reliability of each modality, and a multimodal decision fusion layer that combines evidence from all modalities. PE is the prototype embedding layer, EM is the evidence mapping layer, and EF is the intermodality prototype-based evidence fusion layer. We visualized the contour functions of the output from each module to better illustrate changes in prediction evidence.
  • Figure 2: HECKTOR data distribution visualization (via t-SNE): (a) Clinical data distribution. (b) Radiomic data distribution. (c) Combined (clinical and radiomic) data distribution using data concatenation. (d) Combined (clinical and radiomic) feature distribution using an attention mechanism. The color indicates the distribution density, with yellow representing the highest density.
  • Figure 3: Comparison of the risk stratification performance among different survival methods on HECKTOR2022. High-risk (blue) and low-risk (orange) groups are identified based on the median predicted risk. The log-rank test was used to determine statistical significance ($\alpha = 0.05$).
  • Figure 4: Visualization of survival predictions (GRFNs) for two head and neck cancer patients, with input features mapped via t-SNE. For each patient, the transformed possible survival times (GRFNs) are represented by the Gaussian random variable (GRV) $N(\mu, \sigma^2 | x)$ along the x-axis, with the most plausible survival time $\mu(x)$ marked by a red arrow. The more spread out the GRV (the larger $\sigma^2$), the higher the aleatoric uncertainty. The membership function, tied to the precision $h$, is displayed for each GRV. The wider the membership function, the lower the precision and the higher the epistemic uncertainty. Color intensity indicates the degree of evidence.
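
The quantities named in the Figure 1 and Figure 4 captions (contour function, $\mu$, $\sigma^2$, $h$) can be made concrete with a short sketch. The formulas below are the Gaussian membership function and the GRFN contour function as usually given in the random fuzzy set literature; they are not stated on this page, so treat them as an assumption. The function names are hypothetical.

```python
import math


def gaussian_membership(x, m, h):
    """Membership of x in a Gaussian fuzzy number with mode m and precision h.
    A smaller h gives a wider membership function, i.e. less precision."""
    if h < 0:
        raise ValueError("precision h must be non-negative")
    return math.exp(-0.5 * h * (x - m) ** 2)


def grfn_contour(x, mu, sigma2, h):
    """Contour (plausibility) function of a GRFN with mean mu, variance sigma2
    and precision h (assumed form):
        pl(x) = exp(-h (x - mu)^2 / (2 (1 + h sigma2))) / sqrt(1 + h sigma2)
    """
    if sigma2 < 0 or h < 0:
        raise ValueError("sigma2 and h must be non-negative")
    spread = 1.0 + h * sigma2
    return math.exp(-0.5 * h * (x - mu) ** 2 / spread) / math.sqrt(spread)


# With h = 0 the prediction is vacuous: every value is fully plausible.
vacuous = grfn_contour(10.0, mu=2.0, sigma2=1.0, h=0.0)

# At x = mu the plausibility equals 1 / sqrt(1 + h * sigma2).
peak = grfn_contour(2.0, mu=2.0, sigma2=1.0, h=3.0)
```

Under this form, $h = 0$ encodes total ignorance (a flat contour equal to 1), while a large $h$ concentrates plausibility around $\mu$, which matches the caption's reading of a wide membership function as high epistemic uncertainty.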