Modality-Projection Universal Model for Comprehensive Full-Body Medical Imaging Segmentation

Yixin Chen, Lin Gao, Yajuan Gao, Rui Wang, Jingge Lian, Xiangxi Meng, Yanhua Duan, Leiying Chai, Hongbin Han, Zhaoping Cheng, Zhaoheng Xie

TL;DR

The paper tackles the challenge of designing a universal segmentation model that operates across multiple medical-imaging modalities. It introduces the Modality Projection Universal Model (MPUM), which uses a modality projection controller to map a shared latent tissue representation into modality-specific feature spaces and dynamically generate convolutional kernels, enabling robust brain and whole-body segmentation without task-specific fine-tuning. MPUM demonstrates superior performance over state-of-the-art universal models on multi-modality segmentation, provides precise intracranial hemorrhage quantification for aided diagnosis, and enables whole-body metabolic analyses that reveal brain-body coupling in pediatric epilepsy, all while offering layer-wise saliency maps for improved interpretability. The framework integrates external pre-trained model features to stabilize latent-projection learning, enabling reliable cross-modality generalization and providing practical utility in emergency CT and epilepsy research settings. Overall, MPUM advances multi-task medical imaging by delivering accurate, interpretable segmentation across CT, MR, and PET with potential to streamline clinical workflows and support brain-body axis investigations.

Abstract

The integration of deep learning in medical imaging has shown great promise for enhancing diagnostic, therapeutic, and research outcomes. However, applying universal models across multiple modalities remains challenging due to the inherent variability in data characteristics. This study aims to introduce and evaluate a Modality Projection Universal Model (MPUM). MPUM employs a novel modality-projection strategy, which allows the model to dynamically adjust its parameters to optimize performance across different imaging modalities. The MPUM demonstrated superior accuracy in identifying anatomical structures, enabling precise quantification for improved clinical decision-making. It also identifies metabolic associations within the brain-body axis, advancing research on brain-body physiological correlations. Furthermore, MPUM's unique controller-based convolution layer enables visualization of saliency maps across all network layers, significantly enhancing the model's interpretability.
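The modality-projection idea summarized above (a shared latent representation projected into modality-specific convolution kernels) can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `project_kernel`, `conv1d`, `latent`, and `projections` are hypothetical, the numbers are invented, and a 1-D convolution stands in for the network's actual convolution layers.

```python
def project_kernel(latent, projection):
    """Map a shared latent vector to kernel weights via a matrix-vector product."""
    return [sum(w * z for w, z in zip(row, latent)) for row in projection]


def conv1d(signal, kernel):
    """Valid (no padding) cross-correlation of a 1-D signal with a kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Hypothetical shared latent representation, common to all modalities.
latent = [1.0, 0.5]

# Hypothetical per-modality projection matrices (3 kernel taps x 2 latent dims).
# Each modality turns the same latent vector into a different kernel.
projections = {
    "CT": [[1.0, 0.0], [0.0, 2.0], [1.0, 0.0]],
    "PET": [[0.5, 0.0], [0.0, 0.0], [-0.5, 0.0]],
}

signal = [0.0, 1.0, 2.0, 3.0, 4.0]
for modality, projection in projections.items():
    kernel = project_kernel(latent, projection)
    print(modality, kernel, conv1d(signal, kernel))
```

With these made-up values, the CT projection yields a smoothing kernel and the PET projection yields a difference kernel, so one shared latent vector produces different feature extractors depending on the modality.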

Paper Structure

This paper contains 25 sections, 10 equations, 12 figures.

Figures (12)

  • Figure 1: Overview of the modality projection universal model. a, Training process of the MPUM leveraging data from three distinct modalities. b, Comparison of two common multimodal data training strategies with our proposed modality-projection strategy. c, Application of the MPUM as an aided identification tool across three modalities (over 500 categories). d, Use of the MPUM as a computer-aided diagnosis (CAD) tool for precise localization of intracranial hemorrhage in CT scans. e, Application of the MPUM as an aided analysis tool for identifying altered metabolic correlations in regions affected by epilepsy. f, Additional experimental results, including t-SNE visualizations of feature extraction operators and analysis of the network's saliency map.
  • Figure 2: Comparison of the performance between the MPUM and other advanced models. a, Dice score and surface Dice for CT (left), MR (middle), and PET (right) segmentation tasks. The panels compare four advanced models (UNet, CDUM, PCNet, and STUNet) with three multi-modality training strategies: mixed, specific, and projection. The modality-projection strategy represents the MPUM. b, Table showing the impact of using multi-modality training data on model performance.
  • Figure 3: Performance of the MPUM framework as an aided diagnosis tool. a, Use of the MPUM as an aided diagnosis tool to detect hemorrhages and map brain regions from CT head scans, facilitating precise automatic diagnosis. b, Illustration of the impact of the MPUM framework on enhancing diagnostic accuracy and supporting general doctors in real-world settings.
  • Figure 4: Multi-organ metabolic association analysis for pediatric epilepsy based on the universal model. We analyzed the metabolic associations in the epilepsy patient group (n=50) and the control group (n=22), using Fisher Z-Transformation to calculate the significance of differences in Pearson correlation coefficients. a&b, Schematic representation of the connectivity among brain regions associated with the Right Anterior Temporal Lobe Lateral Part and the right middle and inferior temporal gyrus. The left diagrams illustrate the strong metabolic connection within the control group. Notably, these correlations are statistically significantly reduced in the patient group (p<0.001). c, Metabolic connectivity between the Pallidum and Vertebrae T1-T12 affected by epilepsy.
  • Figure 5: Comprehensive visualization of saliency maps and feature operators. The black region displays a progression of saliency maps from shallow to deep layers, encompassing nine cases: a, a body CT scan, b-e, brain CT scans, f&g, PET scans, h&i, MR scans. The white region showcases the t-SNE visualization of convolutional kernel operators, capturing the feature extraction from shallow to deep layers.
  • ...and 7 more figures
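
The Figure 4 caption mentions using the Fisher Z-transformation to test whether Pearson correlation coefficients differ between the patient and control groups. The standard two-sample form of that test can be sketched as follows. The function name `fisher_z_compare` and the correlation values are hypothetical; only the group sizes (n=50 and n=22) come from the caption, and the paper's exact procedure may differ.

```python
import math


def fisher_z_compare(r1, n1, r2, n2):
    """Two-sided test for a difference between two independent Pearson correlations.

    Each r is mapped to z = atanh(r), which is approximately normal with
    variance 1 / (n - 3); the difference of the two z values is then
    standardized and compared against the standard normal distribution.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z_stat = (z1 - z2) / standard_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))
    return z_stat, p_value


# Hypothetical correlations: strong coupling in controls, weak in patients.
z_stat, p_value = fisher_z_compare(r1=0.8, n1=22, r2=0.1, n2=50)
print(f"z = {z_stat:.3f}, p = {p_value:.5f}")
```

A small p-value here would indicate that the regional metabolic correlation differs between the two groups, which is the kind of evidence the caption reports at p<0.001.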