MM-SurvNet: Deep Learning-Based Survival Risk Stratification in Breast Cancer Through Multimodal Data Fusion

Raktim Kumar Mondol, Ewan K. A. Millar, Arcot Sowmya, Erik Meijering

TL;DR

This work proposes a deep learning approach for survival risk stratification in breast cancer that integrates histopathological imaging, genetic, and clinical data. It uses a vision transformer, specifically the MaxViT model, for image feature extraction, and self-attention to capture relationships among a patient's images at the patient level.

Abstract

Survival risk stratification is an important step in clinical decision making for breast cancer management. We propose a novel deep learning approach for this purpose by integrating histopathological imaging, genetic and clinical data. It employs vision transformers, specifically the MaxViT model, for image feature extraction, and self-attention to capture intricate image relationships at the patient level. A dual cross-attention mechanism fuses these features with genetic data, while clinical data is incorporated at the final layer to enhance predictive accuracy. Experiments on the public TCGA-BRCA dataset show that our model, trained using the negative log likelihood loss function, can achieve superior performance with a mean C-index of 0.64, surpassing existing methods. This advancement facilitates tailored treatment strategies, potentially leading to improved patient outcomes.
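
The abstract reports performance as a mean C-index. As a rough illustration of what that number measures, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' evaluation code; the function name, argument names, and toy data are hypothetical, and pairs with tied event times are simply skipped for brevity.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs that are ranked correctly.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the patient was censored
    risks  : predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier time in a pair must be an observed event
        for j in range(n):
            if times[i] < times[j]:  # strict: tied times are ignored here
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event got the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives context for the reported mean of 0.64.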

Paper Structure (11 sections, 5 equations, 5 figures, 1 table)

Figures (5)

  • Figure 1: Multimodal Data for Breast Cancer Characterization: This figure illustrates the diverse data types at multiple biological levels to comprehensively characterize breast cancer. It encompasses clinical data at the patient level, mammogram images at the organ level, histopathology images at the tissue level, and gene expression data (e.g., RNA Sequencing) at the molecular level, thereby providing a holistic view of the cancer's nature and behavior.
  • Figure 2: Preprocessing Methodology for Histopathology Images: Whole Slide Images (WSIs) are first annotated using the QuPath annotation tool by expert pathologists. From these annotated regions, non-overlapping patches are systematically extracted. Subsequently, a color transformation is applied to ensure uniformity in color representation across datasets.
  • Figure 3: The proposed MM-SurvNet architecture. It utilises a pretrained MaxViT to extract features from histopathology images, which are then processed through a self-attention network to aggregate patch-level features into a patient-level representation.
  • Figure 4: The proposed MM-SurvNet architecture employs a multimodal deep learning framework for survival risk prediction in cancer patients. The image embedding is fused with genetic data through dual cross-attention mechanisms, and the result is further integrated with clinical data in the final layer.
  • Figure 5: Mean survival curves aggregated from all five cross-validation (CV) tests with 95% confidence intervals. The log-rank test for the depicted curves gives $p < 0.05$.
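
The Figure 3 caption describes aggregating patch-level features into a patient-level representation with self-attention. The sketch below shows the general idea in plain Python, under strong simplifying assumptions: the query, key, and value projections are the identity (no learned weights), a single head is used, and the attended patches are mean-pooled. The function names and toy feature vectors are hypothetical and do not reflect the paper's actual network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(patches):
    """Aggregate patch feature vectors into one patient-level vector.

    Assumption: identity query/key/value projections, scaled dot-product
    attention among patches, then mean pooling over the attended patches.
    """
    if not patches:
        raise ValueError("need at least one patch feature vector")
    dim = len(patches[0])
    scale = math.sqrt(dim)
    attended = []
    for query in patches:
        # Similarity of this patch to every patch of the same patient.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in patches]
        weights = softmax(scores)
        # Weighted sum of all patch vectors, one output per query patch.
        attended.append([
            sum(w * value[j] for w, value in zip(weights, patches))
            for j in range(dim)
        ])
    count = len(attended)
    # Mean pooling gives a single vector per patient.
    return [sum(vec[j] for vec in attended) / count for j in range(dim)]


# Toy input (hypothetical): three patches with 4-dimensional features.
toy_patches = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
patient_vector = self_attention_pool(toy_patches)
print(len(patient_vector))
```

In a trained model the projections would be learned and the patch features would come from the pretrained image encoder; the fixed-size output is what allows patients with different numbers of patches to be compared.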