Distilled Prompt Learning for Incomplete Multimodal Survival Prediction

Yingxue Xu, Fengtao Zhou, Chenyu Zhao, Yihui Wang, Can Yang, Hao Chen

TL;DR

This work tackles incomplete multimodal survival prediction by introducing Distilled Prompt Learning (DisPro), a two-stage prompting framework that leverages Large Language Models to compensate for missing pathology and genomics data. In Stage 1, UniPro distills unimodal knowledge into learnable prompts for each modality, enabling modality-specific information to be captured despite data gaps. In Stage 2, MultiPro uses available modalities as prompts to infer the missing modality, while UniPro guidance regularizes the inferred representations to preserve modality-specific information. Extensive experiments on five TCGA cancer datasets show that DisPro outperforms state-of-the-art incomplete-modality methods and can rival complete-modality baselines, highlighting the practical potential of LLM-based prompting for robust survival prediction in clinical settings. The work provides code and demonstrates how staged prompt design can efficiently exploit multimodal data under realistic missingness scenarios.

Abstract

The integration of multimodal data, including pathology images and gene profiles, is widely used for precise survival prediction. Despite recent advances in multimodal survival models, collecting complete modalities for multimodal fusion still poses a significant challenge, hindering their application in clinical settings. Current approaches to incomplete modalities often fall short, as they typically compensate for only a limited part of the knowledge carried by the missing modalities. To address this issue, we propose Distilled Prompt Learning (DisPro), a framework that exploits the strong robustness of Large Language Models (LLMs) to missing modalities and employs two-stage prompting to compensate comprehensively for the information in missing modalities. In the first stage, Unimodal Prompting (UniPro) distills the knowledge distribution of each modality, preparing to supplement the modality-specific knowledge of the missing modality in the subsequent stage. In the second stage, Multimodal Prompting (MultiPro) uses the available modalities as prompts for LLMs to infer the missing modality, which provides modality-common information. Simultaneously, the unimodal knowledge acquired in the first stage is injected into multimodal inference to compensate for the modality-specific knowledge of the missing modality. Extensive experiments covering various missing scenarios demonstrate the superiority of the proposed method. The code is available at https://github.com/Innse/DisPro.

Paper Structure

This paper contains 21 sections, 12 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Insights into existing incomplete multimodal learning approaches and comparison with the proposed method. (a) Generation-based Imputation and Imputation-free approaches, (b) Retrieval-based Imputation, and (c) Ours.
  • Figure 2: Overview of DisPro. Stage 1: Unimodal Prompting aims to distill the knowledge distribution for each modality and prepare to supplement modality-specific knowledge for the missing modality. Stage 2: Multimodal Prompting utilizes the available modality as prompts to infer the representations of the missing one, compensating for modality-common knowledge. Simultaneously, the learned UniPro of Stage 1 supervises the learning of imputed representations to compensate for modality-specific knowledge. UniPro Scoring re-uses learned prompts to assist the LLM in capturing modality-common knowledge by selecting discriminative and relevant tokens.
  • Figure 3: Performance on various combinations under 60% training missing rate for different test scenarios.
  • Figure 4: Performance on various K of Top-K MaxPooling in UniPro Scoring.
  • Figure 5: The visualization of attention signals for each token in LLM used in (a) MAP and (b) DisPro (Ours).
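
The captions for Figures 2 and 4 describe UniPro Scoring as selecting discriminative and relevant tokens with a Top-K operation, but this summary does not give the exact scoring rule. The sketch below is only one plausible reading, not the authors' implementation: it assumes each token is scored by its maximum cosine similarity to any learned prompt, and that the K highest-scoring tokens are kept. The function name `topk_token_select`, the cosine-similarity score, and the array shapes are all assumptions made for illustration.

```python
import numpy as np

def topk_token_select(tokens, prompts, k, eps=1e-8):
    """Keep the k tokens most similar to any prompt (hypothetical scoring rule).

    tokens:  array of shape (n_tokens, dim), e.g. patch or gene embeddings
    prompts: array of shape (n_prompts, dim), standing in for learned prompts
    Returns the selected tokens and their indices in original order.
    """
    tokens = np.asarray(tokens, dtype=float)
    prompts = np.asarray(prompts, dtype=float)

    # L2-normalise so the dot product below is a cosine similarity.
    t = tokens / (np.linalg.norm(tokens, axis=1, keepdims=True) + eps)
    p = prompts / (np.linalg.norm(prompts, axis=1, keepdims=True) + eps)

    # Similarity of every token to every prompt: shape (n_tokens, n_prompts).
    sim = t @ p.T

    # Assumed scoring: max over prompts gives one relevance score per token.
    scores = sim.max(axis=1)

    # Indices of the k highest scores, restored to original token order.
    k = min(k, scores.shape[0])
    idx = np.sort(np.argsort(scores)[-k:])
    return tokens[idx], idx


# Toy example: four 2-D tokens scored against a single prompt.
toy_tokens = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
toy_prompts = np.array([[1.0, 0.0]])
selected, kept = topk_token_select(toy_tokens, toy_prompts, k=2)
```

In this toy example the first and third tokens point closest to the prompt direction, so they are the two that are kept. Under this reading, K trades off how much context reaches the LLM against how much irrelevant signal is filtered out, which is the kind of sensitivity Figure 4 reports.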