
Synergy vs. Noise: Performance-Guided Multimodal Fusion For Biochemical Recurrence-Free Survival in Prostate Cancer

Seth Alain Chang, Muhammad Mueez Amjad, Noorul Wahab, Ethar Alzaid, Nasir Rajpoot, Adam Shephard

TL;DR

The paper asks whether adding more modalities to prognostic models in computational pathology always improves performance. It introduces a performance-guided, intermediate fusion framework that jointly leverages histopathology, radiology, and clinical data to predict time-to-biochemical recurrence in prostate cancer, using cross-attention and self-attention to model inter- and intra-modal dependencies. The key finding is that fusing high-performing modalities yields synergistic gains (e.g., Histopathology + Clinical reaching $C$-Index ≈ 0.835), while including a weak modality such as radiology degrades accuracy, underscoring the need to screen modalities before fusion. This work provides practical guidance for designing multimodal deep learning (MDL) systems in medical imaging: it advocates selective modality inclusion and demonstrates a viable architecture for performance-guided multimodal prognostication. The results motivate validation on larger, more diverse datasets and on other tasks to reinforce the proposed design principles.
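The $C$-Index quoted above measures how well predicted risks rank patients by time to recurrence. As a minimal sketch, assuming Harrell's concordance index and ignoring tied event times, it can be computed as below. The function name `concordance_index` and the toy follow-up values in the usage note are illustrative assumptions, not the paper's code or data.

```python
def concordance_index(times, events, risks):
    """Simplified Harrell's C-index (sketch; tied times are skipped).

    times  : follow-up time per patient
    events : 1 if recurrence was observed, 0 if censored
    risks  : predicted risk score (higher = earlier recurrence expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                # patient i recurred before patient j's last follow-up
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risks count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

With hypothetical follow-up times `[2, 4, 6, 8]`, event flags `[1, 1, 0, 1]`, and risks `[0.9, 0.7, 0.5, 0.1]`, every comparable pair is ranked correctly and the function returns 1.0; identical risks for all patients give 0.5, the value expected from an uninformative model.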

Abstract

Multimodal deep learning (MDL) has emerged as a transformative approach in computational pathology. By integrating complementary information from multiple data sources, MDL models have demonstrated superior predictive performance across diverse clinical tasks compared to unimodal models. However, the assumption that combining modalities inherently improves performance remains largely unexamined. We hypothesise that multimodal gains depend critically on the predictive quality of individual modalities, and that integrating weak modalities may introduce noise rather than complementary information. We test this hypothesis on a prostate cancer dataset with histopathology, radiology, and clinical data to predict time-to-biochemical recurrence. Our results confirm that combining high-performing modalities yields superior performance compared to unimodal approaches. However, integrating a poor-performing modality with other higher-performing modalities degrades predictive accuracy. These findings demonstrate that multimodal benefit requires selective, performance-guided integration rather than indiscriminate modality combination, with implications for MDL design across computational pathology and medical imaging.

Paper Structure

This paper contains 14 sections, 1 figure, and 4 tables.

Figures (1)

  • Figure 1: Overview of multimodal fusion strategies for predicting biochemical recurrence-free survival in prostate cancer. (a) Marginal intermediate fusion (dotted line): modality-specific features are concatenated without cross-modal interaction before prediction. (b) Joint intermediate fusion: features interact through cross-attention and self-attention layers to capture inter- and intra-modal dependencies prior to prediction.
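The contrast between the two strategies in Figure 1 can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' architecture: each modality is reduced to a single embedding vector of the same length, attention uses no learned projection weights, and the names `marginal_fusion` and `joint_fusion` are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def marginal_fusion(modalities):
    """(a) Concatenate modality embeddings with no cross-modal interaction."""
    return [x for emb in modalities for x in emb]

def joint_fusion(modalities):
    """(b) Cross-attention between modalities, then self-attention, then pooling.

    Requires at least two modalities with embeddings of equal length.
    """
    tokens = list(modalities)
    crossed = []
    for i, tok in enumerate(tokens):
        # Cross-attention: this modality queries the other modalities.
        others = tokens[:i] + tokens[i + 1:]
        attended = attention([tok], others, others)[0]
        # Residual connection keeps the modality's own signal.
        crossed.append([a + b for a, b in zip(tok, attended)])
    # Self-attention over all cross-attended tokens.
    mixed = attention(crossed, crossed, crossed)
    # Mean-pool the tokens into one fused vector for a prediction head.
    n = len(mixed)
    return [sum(tok[j] for tok in mixed) / n for j in range(len(mixed[0]))]
```

For three modality embeddings of length 4, `marginal_fusion` returns a vector of length 12 whose entries are untouched copies of the inputs, whereas `joint_fusion` returns a vector of length 4 in which every entry depends on all three modalities. That dependence is what allows a weak modality to inject noise into the fused representation, which is the risk the paper highlights.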