Robust Multimodal Survival Prediction with the Latent Differentiation Conditional Variational AutoEncoder

Junjie Zhou; Jiao Tang; Yingli Zuo; Peng Wan; Daoqiang Zhang; Wei Shao

TL;DR

This work tackles survival prediction by integrating histopathology images and genomic data while addressing missing genomic modalities. It introduces LD-CVAE, a conditional latent differentiation variational autoencoder that learns function-specific genomic embeddings from gigapixel WSIs, augmented by a Variational Information Bottleneck Transformer to encode pathology efficiently. A product-of-experts framework fuses the pathology and reconstructed genomics into a joint latent distribution, guided by an alignment loss to improve cross-modality consistency, and a co-attention-based fusion yields survival predictions. Across five TCGA cancer cohorts, LD-CVAE outperforms unimodal and most multimodal baselines, and remains robust when genomic data are unavailable, highlighting its practical potential for real-world prognostic modeling.

Abstract

The integrative analysis of histopathological images and genomic data has received increasing attention for survival prediction of human cancers. However, existing studies typically assume that all modalities are available. In practice, genomic data are costly to collect, so they are sometimes unavailable for test samples. A common way of handling this incompleteness is to generate genomic representations from the pathology images. Nevertheless, such a strategy still faces two challenges: (1) gigapixel whole slide images (WSIs) are huge and thus hard to represent; (2) it is difficult to generate genomic embeddings with diverse function categories in a unified generative framework. To address these challenges, we propose a Conditional Latent Differentiation Variational AutoEncoder (LD-CVAE) for robust multimodal survival prediction, even with missing genomic data. Specifically, a Variational Information Bottleneck Transformer (VIB-Trans) module is proposed to learn compressed pathological representations from the gigapixel WSIs. To generate different functional genomic features, we develop a novel Latent Differentiation Variational AutoEncoder (LD-VAE) that learns the common and specific posteriors for genomic embeddings with diverse functions. Finally, we use the product-of-experts technique to integrate the genomic common posterior and the image posterior when estimating the joint latent distribution in LD-CVAE. We evaluate our method on five cancer datasets, and the experimental results demonstrate its superiority in both complete and missing modality scenarios.
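As a rough illustration of the product-of-experts step named in the abstract, the sketch below fuses several one-dimensional Gaussian experts into a single Gaussian by summing their precisions. This is the generic textbook form, not the authors' implementation: the function name `product_of_gaussian_experts`, the scalar (rather than vector-valued) latents, and the example numbers are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def product_of_gaussian_experts(
    means: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Fuse 1-D Gaussian experts N(mean_i, var_i) into one Gaussian.

    Hypothetical helper: the product of Gaussian densities is (up to
    normalisation) a Gaussian whose precision is the sum of the experts'
    precisions and whose mean is the precision-weighted average of their means.
    """
    if len(means) == 0 or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")

    precisions = [1.0 / v for v in variances]
    fused_var = 1.0 / sum(precisions)
    fused_mean = fused_var * sum(m * p for m, p in zip(means, precisions))
    return fused_mean, fused_var


# Illustrative values only: a "pathology" expert and a "genomic" expert.
fused_mean, fused_var = product_of_gaussian_experts([0.0, 2.0], [1.0, 0.25])
print(fused_mean, fused_var)
```

In this toy case the fused mean lies closer to the second expert because its variance is smaller, which is the usual motivation for this kind of fusion: the more confident posterior dominates the joint estimate.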
