Glioma Multimodal MRI Analysis System for Tumor Layered Diagnosis via Multi-task Semi-supervised Learning

Yihao Liu, Zhihao Cui, Liming Li, Junjie You, Xinle Feng, Jianxin Wang, Xiangyu Wang, Qing Liu, Minghua Wu

TL;DR

GMMAS presents a multimodal MRI framework for glioma layered diagnosis that jointly performs tumor segmentation and histological/molecular subtyping using an uncertainty-weighted multi-task loss and a two-stage semi-supervised learning approach. The architecture combines CNN, Transformer, and U-Net components with learnable modality fusion, an adaptation module for missing modalities, and a Tumor-CutMix data augmentation strategy to improve robustness and calibration. The system achieves state-of-the-art performance across segmentation and biomarker prediction tasks, demonstrates strong adaptation to absent MRI modalities, and is paired with a GMMAS-GPT platform that generates personalized prognostic reports. This work advances clinically practical, integrated AI for glioma evaluation and paves the way toward broader adoption in multimodal neuro-oncology workflows.

Abstract

Gliomas are the most common primary tumors of the central nervous system. Multimodal MRI is widely used for the preliminary screening of gliomas and plays a crucial role in auxiliary diagnosis, assessment of therapeutic efficacy, and prognostic evaluation. To date, computer-aided diagnostic studies of gliomas using MRI have focused on independent analysis tasks such as tumor segmentation, grading, and radiogenomic classification, without studying the inter-dependencies among these tasks. In this study, we propose a Glioma Multimodal MRI Analysis System (GMMAS) that uses a deep learning network to process multiple tasks simultaneously. It leverages their inter-dependencies through an uncertainty-based multi-task learning architecture and synchronously outputs tumor region segmentation, glioma histological subtype, IDH mutation genotype, and 1p/19q chromosome disorder status. Compared with previously reported single-task analysis models, GMMAS improves the precision across tumor layered diagnostic tasks. Additionally, we employ a two-stage semi-supervised learning method that enhances model performance by exploiting both labeled and unlabeled MRI samples. Further, by using an adaptation module based on knowledge self-distillation and contrastive learning for cross-modal feature extraction, GMMAS remains robust when modalities are absent and reveals the differing significance of each MRI modality. Finally, based on the analysis outputs of GMMAS, we created a visual and user-friendly platform for doctors and patients, introducing GMMAS-GPT to generate personalized prognosis evaluations and suggestions.
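
The abstract describes an uncertainty-based multi-task architecture but does not spell out the loss. As a rough illustration only, the sketch below uses one common homoscedastic-uncertainty weighting, in which task i contributes exp(-s_i) * L_i + s_i with a learnable s_i = log(sigma_i^2). This formulation, the function name `uncertainty_weighted_loss`, and the numeric loss values are assumptions for illustration and are not taken from the paper.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-task losses using per-task log-variances s_i.

    Each task contributes exp(-s_i) * L_i + s_i. A larger s_i (more
    uncertainty) shrinks that task's weight, while the + s_i term
    penalises inflating the uncertainty without limit.
    NOTE: assumed formulation, not necessarily the loss used by GMMAS.
    """
    if len(task_losses) != len(log_variances):
        raise ValueError("need exactly one log-variance per task")
    total = 0.0
    for loss, s in zip(task_losses, log_variances):
        total += math.exp(-s) * loss + s
    return total


# Hypothetical per-task losses: segmentation, histology, IDH, 1p/19q.
losses = [0.30, 0.70, 0.55, 0.60]

# With s_i = 0 every task has weight 1, so the result is the plain sum.
equal_weight_total = uncertainty_weighted_loss(losses, [0.0, 0.0, 0.0, 0.0])

# Raising s for the second task halves its weight (exp(-log 2) = 0.5).
reweighted_total = uncertainty_weighted_loss(
    losses, [0.0, math.log(2.0), 0.0, 0.0]
)
```

In a real training loop the s_i values would be trainable parameters updated alongside the network weights, so the balance between segmentation and the three classification heads is learned rather than hand-tuned.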

Paper Structure

This paper contains 19 sections, 11 equations, 13 figures, 7 tables, 1 algorithm.

Figures (13)

  • Figure 1: Data preprocessing diagram. Skull stripping removes the skull from MRI brain images, leaving only the brain tissue for analysis. Data augmentation is then applied to the stripped images, using transformations such as cropping and flipping to increase the diversity of the dataset. "Tumor-CutMix" is a technique for creating composite images: segments of the tumor region from one class (e.g., LGG [1,0]) are combined with tumor regions of another class (e.g., GBM [0,1]) to generate a blended tumor image, with the corresponding labels indicating the proportion of each class in the composite (e.g., [0.6,0.4]).
  • Figure 2: Overview of GMMAS architecture. Feature extraction begins with MRI inputs. Features are extracted from each MRI modality and then combined into a weighted-average representation. This representation is refined through a series of transformer layers. The refined features are passed to a segmentation decoder, which outputs the tumor segmentation on the MRI image. In parallel, the features are fed into a multi-task classifier for glioma histological and molecular subtyping tasks. In addition, a U-Net that takes down-sampled full MRI images as input is used to extract global image features, alleviating the limitations of the original patch-based learning. Global features from different abstraction levels are merged with the backbone outputs using a channel attention-based feature fusion module, as shown below. GAP: global average pooling, GMP: global max pooling.
  • Figure 3: Illustration of the first stage of the adaptation module: a dual-pathway knowledge self-distillation structure. The top pathway processes full-modal inputs, while the bottom pathway handles cases where one or two modalities are randomly omitted, simulating scenarios with incomplete data. A mean squared error (MSE) loss between the features of the two pathways helps the model learn cross-modal features and maintain performance. Dice losses against the ground truth labels refine the segmentation accuracy of each pathway.
  • Figure 4: Illustration of the second stage of the adaptation module. A contrastive learning Siamese network for cross-modal feature extraction in MRI images.
  • Figure 5: Calibration curves (a) before and (b) after introducing the Tumor-CutMix strategy. (c) ECE values versus prediction epistemic uncertainty values of the uncalibrated and calibrated models.
  • ...and 8 more figures
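
The Tumor-CutMix idea in the Figure 1 caption can be sketched roughly as follows. This is a minimal 2D illustration under stated assumptions, not the authors' implementation: the function name `tumor_cutmix`, the use of binary tumor masks, and the rule that the mixed label is weighted by the share of composite tumor pixels coming from each source are all hypothetical choices made to reproduce the kind of soft label shown in the caption (e.g., [0.6,0.4]).

```python
def tumor_cutmix(img_a, mask_a, label_a, img_b, mask_b, label_b):
    """Paste the tumor pixels of sample B into sample A and mix the labels.

    img_*:   2D lists of intensities with identical shape.
    mask_*:  2D lists of 0/1 marking each sample's tumor region.
    label_*: one-hot class labels, e.g. [1, 0] for LGG and [0, 1] for GBM.
    ASSUMPTION: label weights follow the fraction of composite tumor
    pixels contributed by each source image.
    """
    mixed_img = [row[:] for row in img_a]
    mixed_mask = [row[:] for row in mask_a]
    n_b = 0  # tumor pixels taken from sample B
    for r in range(len(img_a)):
        for c in range(len(img_a[0])):
            if mask_b[r][c]:
                mixed_img[r][c] = img_b[r][c]
                mixed_mask[r][c] = 1
                n_b += 1
    n_total = sum(sum(row) for row in mixed_mask)
    if n_total == 0:
        raise ValueError("neither sample contains tumor pixels")
    n_a = n_total - n_b  # tumor pixels of A that were not overwritten
    mixed_label = [
        (n_a * la + n_b * lb) / n_total for la, lb in zip(label_a, label_b)
    ]
    return mixed_img, mixed_mask, mixed_label


# Toy 4x4 example: A has 3 tumor pixels (LGG), B has 2 tumor pixels (GBM).
img_a = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
mask_a = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
img_b = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 9, 9], [0, 0, 0, 0]]
mask_b = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 0]]

mixed_img, mixed_mask, mixed_label = tumor_cutmix(
    img_a, mask_a, [1, 0], img_b, mask_b, [0, 1]
)
```

Here three of the five composite tumor pixels come from the LGG sample and two from the GBM sample, so the soft label reflects a 3:2 mix. A real pipeline would operate on 3D multimodal volumes and would likely select only a cut region of the donor tumor rather than the whole mask.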