AdaMHF: Adaptive Multimodal Hierarchical Fusion for Survival Prediction

Shuaiyu Zhang, Xun Lin, Rongxiang Zhang, Yu Bai, Yong Xu, Tao Tan, Xunbin Zheng, Zitong Yu

TL;DR

AdaMHF addresses survival prediction from paired pathology images and genomic data under missing-modality conditions by introducing a modular, adaptive fusion framework. It combines Progressive Residual Experts Expansion (PREE) for heterogeneity-aware feature extraction with Adaptive Token Selection and Aggregation (ATSA) and a hierarchical Low-Rank Multimodal Fusion (LMF) to integrate local and global information efficiently. The authors also establish a missing-modality benchmark and demonstrate robust, state-of-the-art performance across five TCGA cancer cohorts, with strong results even when one modality is absent. Collectively, AdaMHF offers accurate, compute-efficient survival predictions suitable for real-world clinical settings where complete multimodal data are not always available.

Abstract

The integration of pathologic images and genomic data for survival analysis has gained increasing attention with advances in multimodal learning. However, current methods often ignore biological characteristics, such as heterogeneity and sparsity, both within and across modalities, ultimately limiting their adaptability to clinical practice. To address these challenges, we propose AdaMHF: Adaptive Multimodal Hierarchical Fusion, a framework designed for efficient, comprehensive, and tailored feature extraction and fusion. AdaMHF is specifically adapted to the uniqueness of medical data, enabling accurate predictions with minimal resource consumption, even under challenging scenarios with missing modalities. Initially, AdaMHF employs an experts expansion and residual structure to activate specialized experts for extracting heterogeneous and sparse features. Extracted tokens undergo refinement via selection and aggregation, reducing the weight of non-dominant features while preserving comprehensive information. Subsequently, the encoded features are hierarchically fused, allowing multi-grained interactions across modalities to be captured. Furthermore, we introduce a survival prediction benchmark designed to resolve scenarios with missing modalities, mirroring real-world clinical conditions. Extensive experiments on TCGA datasets demonstrate that AdaMHF surpasses current state-of-the-art (SOTA) methods, showcasing exceptional performance in both complete and incomplete modality settings.
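
The TL;DR names low-rank multimodal fusion as the mechanism for combining the two modalities. The general idea can be sketched as follows: instead of building the full outer product of the pathology and genomic feature vectors and multiplying it by a large weight tensor, the weight tensor is written as a sum of `rank` modality-specific factors, so the same result is obtained from small per-modality projections. This is a minimal NumPy sketch of that generic technique, not the authors' hierarchical implementation; the function names, dimensions, and random inputs are all illustrative assumptions.

```python
import numpy as np


def low_rank_fusion(z_path, z_gene, factors_path, factors_gene):
    """Fuse two feature vectors without materialising their outer product.

    z_path: (d_path,) pathology features; z_gene: (d_gene,) genomic features.
    factors_path: (rank, d_path + 1, d_out); factors_gene: (rank, d_gene + 1, d_out).
    All names and shapes are hypothetical choices for this sketch.
    """
    # Append a constant 1 so unimodal terms survive the multiplicative fusion.
    zp = np.append(z_path, 1.0)
    zg = np.append(z_gene, 1.0)
    # Project each modality with every rank factor: result is (rank, d_out).
    proj_p = np.einsum("i,rio->ro", zp, factors_path)
    proj_g = np.einsum("i,rio->ro", zg, factors_gene)
    # Multiply element-wise per rank component, then sum over the rank axis.
    return (proj_p * proj_g).sum(axis=0)


def full_tensor_fusion(z_path, z_gene, factors_path, factors_gene):
    """Reference: rebuild the full weight tensor and apply it to the outer product."""
    zp = np.append(z_path, 1.0)
    zg = np.append(z_gene, 1.0)
    # weight[i, j, o] = sum_r factors_path[r, i, o] * factors_gene[r, j, o]
    weight = np.einsum("rio,rjo->ijo", factors_path, factors_gene)
    return np.einsum("i,j,ijo->o", zp, zg, weight)


rng = np.random.default_rng(0)
d_path, d_gene, d_out, rank = 6, 4, 3, 2  # toy sizes, chosen arbitrarily
z_path = rng.normal(size=d_path)
z_gene = rng.normal(size=d_gene)
factors_path = rng.normal(size=(rank, d_path + 1, d_out))
factors_gene = rng.normal(size=(rank, d_gene + 1, d_out))

fused = low_rank_fusion(z_path, z_gene, factors_path, factors_gene)
reference = full_tensor_fusion(z_path, z_gene, factors_path, factors_gene)
print(fused.shape, np.allclose(fused, reference))
```

The low-rank path never allocates the `(d_path + 1) × (d_gene + 1) × d_out` weight tensor, which is where the efficiency gain comes from when the feature dimensions are large.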

Paper Structure

This paper contains 33 sections, 10 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: The AdaMHF framework processes preprocessed genomic and pathologic data through two identical units, the segregation unit and the integration unit, both of which use the PREE module to address heterogeneity. Their outputs are combined via a hierarchical fusion mechanism to enhance predictive capability, with cross-modal interaction facilitated by a cross-attention module. Additionally, the ATSA module is designed to keep the model efficient and to handle sparsity in the data.
  • Figure 2: The structures of the PREE and ATSA modules: (a) The structure of PREE, where each expert is implemented as a CNN for the pathological modality and as an SNN for genomic data. (b) The detailed structure of ATSA, where the input consists of a class token and the remaining patch-based tokens. Here, $B$ is the batch size, $N$ is the number of tokens, and $L$ is the length of each token.
  • Figure 3: Statistical analysis and ablation study: (a) Kaplan-Meier curves for statistical analysis of AdaMHF on the LUAD dataset. (b) Ablation study of AdaMHF's key components on the LUAD and BLCA datasets. (c) Computational complexity analysis.
  • Figure 5: Kaplan-Meier curves for statistical analysis of AdaMHF on the comprehensive TCGA datasets.
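
The Kaplan-Meier curves referenced in the figure captions are step functions estimated from follow-up times and event indicators. The following is a minimal pure-Python sketch of the standard product-limit estimator, included only to illustrate how such a curve is computed. The function name and the toy follow-up data are assumptions for illustration and are not drawn from the TCGA cohorts.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times: follow-up durations; events: 1 if death was observed, 0 if censored.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every subject leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Remove both deaths and censored subjects before the next time point.
        at_risk -= exits[t]
    return curve


# Toy data (hypothetical): six patients, two of them censored.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.4f}")
```

Subjects censored at a given time are still counted as at risk for events at that same time, which is the usual convention for tied times.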