CrossFusion: A Multi-Scale Cross-Attention Convolutional Fusion Model for Cancer Survival Prediction

Rustin Soraki; Huayu Wang; Joann G. Elmore; Linda Shapiro

TL;DR

Cancer survival prediction from whole slide images (WSIs) is challenging because of their enormous size and tissue heterogeneity. CrossFusion introduces a multi-scale cross-attention framework that fuses patches from 5x, 10x, and 20x magnifications through a Cross-Attention Block, a Pad-Transformer, and a Conv Processor, producing a prediction token for prognosis. Hazards $\mathbf{h}$ are derived from the logits $\mathbf{l}$ via the sigmoid, $\mathbf{h}=\sigma(\mathbf{l})$, and survival is $\mathbf{S}=\prod (1-\mathbf{h})$. The approach achieves state-of-the-art or near state-of-the-art performance across six TCGA cancer types, produces interpretable heatmaps that show region-specific decisions, and gains further when paired with domain-specific feature backbones such as Uni2-h. These results suggest that CrossFusion can improve prognostication and support personalized cancer treatment while remaining interpretable and open to future multimodal extensions. The publicly released code supports reproducibility and adoption in clinical research.
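The hazard and survival relations quoted in the TL;DR can be illustrated with a short NumPy sketch. It assumes the usual discrete-time reading, in which each logit corresponds to one time interval and survival up to interval k is the running product of (1 - h) over intervals 1 through k. The function name `hazards_to_survival` and the four-interval example are hypothetical and are not taken from the CrossFusion code.

```python
import numpy as np

def hazards_to_survival(logits):
    """Map per-interval logits to hazards and cumulative survival.

    Assumption: one logit per discrete time interval, ordered in time.
    """
    logits = np.asarray(logits, dtype=float)
    hazards = 1.0 / (1.0 + np.exp(-logits))        # h = sigmoid(l)
    survival = np.cumprod(1.0 - hazards, axis=-1)  # S_k = prod_{j<=k} (1 - h_j)
    return hazards, survival

# Hypothetical example: four intervals, all logits zero, so every hazard is 0.5.
hazards, survival = hazards_to_survival([0.0, 0.0, 0.0, 0.0])
print(hazards)   # [0.5 0.5 0.5 0.5]
print(survival)  # [0.5 0.25 0.125 0.0625]
```

Because each hazard lies in (0, 1), the survival values never increase from one interval to the next, which is what a survival curve requires.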

Abstract

Cancer survival prediction from whole slide images (WSIs) is a challenging task in computational pathology due to the large size, irregular shape, and high granularity of the WSIs. These characteristics make it difficult to capture the full spectrum of patterns, from subtle cellular abnormalities to complex tissue interactions, which are crucial for accurate prognosis. To address this, we propose CrossFusion, a novel multi-scale feature integration framework that extracts and fuses information from patches across different magnification levels. By effectively modeling both scale-specific patterns and their interactions, CrossFusion generates a rich feature set that enhances survival prediction accuracy. We validate our approach across six cancer types from public datasets, demonstrating significant improvements over existing state-of-the-art methods. Moreover, when coupled with domain-specific feature extraction backbones, our method shows further gains in prognostic performance compared to general-purpose backbones. The source code is available at: https://github.com/RustinS/CrossFusion
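To make the idea of fusing information across magnification levels concrete, the sketch below shows generic single-head scaled dot-product cross-attention, in which patch embeddings from one magnification act as queries over embeddings from another. It is not the authors' Cross-Attention Block: the learned projections are omitted, and the function name, feature dimensions, and patch counts are hypothetical choices made only for illustration.

```python
import numpy as np

def cross_attention(queries, context):
    """Generic scaled dot-product cross-attention (no learned projections).

    queries: (n_q, d) embeddings from one magnification.
    context: (n_c, d) embeddings from another magnification.
    """
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)            # (n_q, n_c) similarities
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ context, weights

rng = np.random.default_rng(0)
feats_20x = rng.normal(size=(16, 8))  # 16 hypothetical high-magnification embeddings
feats_5x = rng.normal(size=(4, 8))    # 4 hypothetical low-magnification embeddings
fused, weights = cross_attention(feats_20x, feats_5x)
print(fused.shape, weights.shape)  # (16, 8) (16, 4)
```

Each high-magnification patch ends up with a weighted summary of the low-magnification context, and the attention weights are the kind of quantity that can be visualized as a heatmap.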
