Towards Personalized Multi-Modal MRI Synthesis across Heterogeneous Datasets

Yue Zhang, Zhizheng Zhuo, Siyao Xu, Shan Lv, Zhaoxi Liu, Jun Qiu, Qiuli Wang, Yaou Liu, S. Kevin Zhou

TL;DR

PMM-Synth is a personalized MRI synthesis framework that not only supports various synthesis tasks but also generalizes effectively across heterogeneous datasets, and holds potential for supporting reliable diagnosis under real-world modality-missing scenarios.

Abstract

Synthesizing missing modalities in multi-modal magnetic resonance imaging (MRI) is vital for ensuring diagnostic completeness, particularly when full acquisitions are infeasible due to time constraints, motion artifacts, or limited patient tolerance. Recent unified synthesis models have enabled flexible synthesis tasks by accommodating various input-output configurations. However, their training and evaluation are typically restricted to a single dataset, limiting their generalizability across diverse clinical datasets and impeding practical deployment. To address this limitation, we propose PMM-Synth, a personalized MRI synthesis framework that not only supports various synthesis tasks but also generalizes effectively across heterogeneous datasets. PMM-Synth is jointly trained on multiple multi-modal MRI datasets that differ in modality coverage, disease types, and intensity distributions. It achieves cross-dataset generalization through three core innovations: a Personalized Feature Modulation module that dynamically adapts feature representations based on the dataset identifier to mitigate the impact of distributional shifts; a Modality-Consistent Batch Scheduler that facilitates stable and efficient batch training under inconsistent modality conditions; and a selective supervision loss that ensures effective learning when ground-truth modalities are partially missing. Evaluated on four clinical multi-modal MRI datasets, PMM-Synth consistently outperforms state-of-the-art methods in both one-to-one and many-to-one synthesis tasks, achieving superior PSNR and SSIM scores. Qualitative results further demonstrate improved preservation of anatomical structures and pathological details. Additionally, downstream tumor segmentation and radiological reporting studies suggest that PMM-Synth holds potential for supporting reliable diagnosis under real-world modality-missing scenarios.
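
The selective supervision idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the loss formula, so everything below is an assumption made for illustration: a plain L1 error, images represented as flat lists of floats, and the hypothetical function name `selective_l1_loss`. The only point it tries to show is that a synthesized modality contributes to the loss only when its ground truth exists for that case.

```python
from typing import Dict, List, Optional

Image = List[float]  # flattened image; a stand-in for a real tensor


def selective_l1_loss(
    pred: Dict[str, Image],
    target: Dict[str, Optional[Image]],
) -> float:
    """Average L1 error over the modalities that have ground truth.

    Assumption: a missing ground-truth modality is marked with None
    and is simply skipped, so it adds no supervision signal.
    """
    per_modality = []
    for name, p in pred.items():
        gt = target.get(name)
        if gt is None:
            continue  # no ground truth for this modality in this case
        per_modality.append(sum(abs(a - b) for a, b in zip(p, gt)) / len(p))
    if not per_modality:
        return 0.0  # nothing to supervise for this case
    return sum(per_modality) / len(per_modality)


# Hypothetical case: DWI was never acquired, so only T1C and ADC are supervised.
pred = {"T1C": [0.5, 0.5], "DWI": [1.0, 0.0], "ADC": [0.2, 0.8]}
target = {"T1C": [0.0, 1.0], "DWI": None, "ADC": [0.2, 0.8]}
loss = selective_l1_loss(pred, target)
```

In this toy case the DWI prediction is ignored entirely, so an arbitrarily poor DWI output would not change the loss. The actual loss in PMM-Synth may combine further terms (for example adversarial ones) that the abstract does not describe.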

Paper Structure (10 sections, 11 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Overview of PMM-Synth. (a) Commonly used multi-modal brain MR imaging includes T1, T2, T1C, FLAIR, DWI, and ADC sequences. (b) Illustration of inter-dataset heterogeneity, which mainly comprises modality inconsistency and distributional shifts. Modality inconsistency means that different datasets cover different modality combinations, and cases within the same dataset may also contain varying subsets of modalities (e.g., the TTI dataset). Distributional shifts are exemplified using the FLAIR modality, where images from different datasets show clear differences in visual appearance and intensity distribution. (c) Architectural design of PMM-Synth. The framework includes three core components: the Personalized Feature Modulation (PFM) module to model dataset-specific distributions, the Modality-Consistent Batch Scheduler (MCBS) to enable efficient batch training under modality inconsistency, and the selective supervision loss to enable effective learning with partially missing ground truth. PMM-Synth improves performance in downstream clinical tasks such as tumor segmentation and radiological reporting.
  • Figure 2: Comprehensive evaluation and ablation analysis of PMM-Synth across heterogeneous multi-modal MRI datasets.
  • Figure 3: Examples of cases with real and synthesized sequences. '3-to-3' refers to synthesizing T1C, DWI, and ADC from T1, T2, and FLAIR; '2-to-4' uses T1 and T2 to generate the remaining four modalities; '1-to-5' uses only T1 to synthesize the other five modalities. The synthesized modalities are highlighted with yellow dashed boxes.
  • Figure S1: The synthesis backbone adopted in this work is extended from our previous study. It comprises one common encoding stream, six modality-specific encoding streams, six decoding streams, and six discriminators, each specifically designed to process a corresponding imaging modality. A DFUM module is employed to integrate feature representations from a variable number of input modalities.
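
The modality inconsistency described in Figure 1(b) makes naive batching awkward, because cases in one batch may not share the same set of available sequences. One plausible way to obtain modality-consistent batches is to group cases by their available modality set before batching. The sketch below is only a guess at that general idea, not the scheduler used in PMM-Synth; the function name `modality_consistent_batches`, the case identifiers, and the grouping strategy are all assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

Batch = Tuple[FrozenSet[str], List[str]]  # (shared modality set, case ids)


def modality_consistent_batches(
    cases: Dict[str, FrozenSet[str]],
    batch_size: int,
    seed: int = 0,
) -> List[Batch]:
    """Build batches whose cases all share the same available modalities.

    Assumption: grouping by the exact modality set is sufficient; a real
    scheduler might also balance datasets or sample target modalities.
    """
    rng = random.Random(seed)

    # Group case ids by the exact set of modalities they provide.
    groups: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for case_id, modalities in cases.items():
        groups[modalities].append(case_id)

    # Shuffle within each group, then cut each group into batches.
    batches: List[Batch] = []
    for modalities, ids in groups.items():
        rng.shuffle(ids)
        for start in range(0, len(ids), batch_size):
            batches.append((modalities, ids[start:start + batch_size]))

    # Shuffle batch order so training does not sweep one group at a time.
    rng.shuffle(batches)
    return batches


# Hypothetical cases: three with T1/T2/FLAIR only, two with all six sequences.
partial = frozenset({"T1", "T2", "FLAIR"})
full = frozenset({"T1", "T2", "T1C", "FLAIR", "DWI", "ADC"})
cases = {"a1": partial, "a2": partial, "a3": partial, "b1": full, "b2": full}
batches = modality_consistent_batches(cases, batch_size=2)
```

Because every batch carries a single modality set, the input and output streams active for that batch are the same for all of its cases, which is the property the name of the scheduler suggests. The last batch of a group may be smaller than `batch_size`.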