Unified Multi-Modal Image Synthesis for Missing Modality Imputation
Yue Zhang, Chengtao Peng, Qiuli Wang, Dan Song, Kaiyan Li, S. Kevin Zhou
TL;DR
This work tackles missing modalities in multi-modal MRI by proposing a unified, single-model synthesis framework that imputes absent contrasts from arbitrary combinations of available inputs. It introduces a Commonality- and Discrepancy-Sensitive Encoder (CDS-Encoder) to exploit modality-invariant and modality-specific information, and a Dynamic Feature Unification Module (DFUM) that robustly fuses features from varying modality sets through both hard and soft integration. A curriculum-learning strategy guides training across easy-to-hard missingness scenarios, and four discriminators enforce realistic, high-frequency detail through PatchGAN-based adversarial losses. Extensive experiments on BraTS and IXI demonstrate superior quantitative and qualitative performance across one-to-one, many-to-one, and unified synthesis tasks, along with downstream gains in tumor segmentation and cross-plane consistency. The results suggest practical value for missing-data imputation in clinical pipelines and a possible extension to 3D volumetric synthesis, subject to memory cost and cross-dataset generalizability.
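The easy-to-hard curriculum mentioned above could be sketched as a sampler of availability masks. This is only an illustration under stated assumptions: the summary does not define what counts as "easy" or how the schedule progresses, so the sketch assumes difficulty grows with the number of missing modalities and that the cap rises linearly over epochs. The modality names and the helpers `scenarios_with_missing`, `max_missing_for_epoch`, and `sample_mask` are hypothetical.

```python
import random
from itertools import combinations

# Assumed BraTS-style contrast names; not taken from the text above.
MODALITIES = ("T1", "T1ce", "T2", "FLAIR")


def scenarios_with_missing(k, modalities=MODALITIES):
    """Return every availability mask (1 = present) with exactly k modalities missing."""
    n = len(modalities)
    masks = []
    for missing in combinations(range(n), k):
        masks.append(tuple(0 if i in missing else 1 for i in range(n)))
    return masks


def max_missing_for_epoch(epoch, total_epochs, n_modalities=len(MODALITIES)):
    """Assumed schedule: raise the allowed number of missing modalities linearly."""
    hardest = n_modalities - 1  # at least one input modality must remain
    progress = epoch / max(total_epochs - 1, 1)
    return 1 + int(progress * (hardest - 1))


def sample_mask(epoch, total_epochs, rng=random):
    """Draw one missingness scenario no harder than the current epoch allows."""
    k_max = max_missing_for_epoch(epoch, total_epochs)
    k = rng.randint(1, k_max)
    return rng.choice(scenarios_with_missing(k))


if __name__ == "__main__":
    for epoch in (0, 5, 9):
        print(epoch, sample_mask(epoch, total_epochs=10))
```

Under these assumptions, early epochs only see scenarios with a single missing contrast, while the final epochs may draw masks with as few as one available input.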
Abstract
Multi-modal medical images provide complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption, and differing imaging protocols often result in incomplete multi-modal images, which limits the use of multi-modal data for clinical purposes. To address this issue, we propose a novel unified multi-modal image synthesis method for missing modality imputation. Our method adopts a generative adversarial architecture and aims to synthesize missing modalities from any combination of available ones with a single model. To this end, we design a Commonality- and Discrepancy-Sensitive Encoder for the generator to exploit both the modality-invariant and the modality-specific information contained in the input modalities. Incorporating both types of information facilitates the generation of images with consistent anatomy and realistic details of the desired distribution. In addition, we propose a Dynamic Feature Unification Module to integrate information from a varying number of available modalities, which makes the network robust to randomly missing modalities. The module performs both hard integration and soft integration, ensuring effective feature combination while avoiding information loss. Validated on two public multi-modal magnetic resonance datasets, the proposed method handles various synthesis tasks effectively and outperforms previous methods.
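To make the idea of fusing a varying number of modalities concrete, here is a minimal sketch. It is not the paper's Dynamic Feature Unification Module: the abstract does not define the two operations, so the sketch assumes that hard integration is an element-wise maximum and soft integration is a softmax-weighted average, and that the two results are concatenated. Plain Python lists stand in for feature tensors, and `hard_integration`, `soft_integration`, `unify`, and the per-modality `scores` are hypothetical names.

```python
import math


def hard_integration(features):
    """Assumed hard integration: element-wise maximum over available modalities."""
    return [max(values) for values in zip(*features)]


def soft_integration(features, scores):
    """Assumed soft integration: softmax-weighted average, one score per modality."""
    peak = max(scores)  # subtract the peak for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * v for w, v in zip(weights, values)) for values in zip(*features)]


def unify(all_features, available, scores):
    """Fuse only the modalities flagged as available, however many there are."""
    feats = [f for f, a in zip(all_features, available) if a]
    kept_scores = [s for s, a in zip(scores, available) if a]
    if not feats:
        raise ValueError("at least one modality must be available")
    hard = hard_integration(feats)
    soft = soft_integration(feats, kept_scores)
    return hard + soft  # concatenate both views; output length is input-count independent


if __name__ == "__main__":
    feats = [[1.0, 4.0], [3.0, 2.0], [9.0, 9.0]]
    print(unify(feats, available=[1, 1, 0], scores=[0.0, 0.0, 5.0]))
```

Both operations are symmetric in their inputs and produce an output whose size does not depend on how many modalities are present, which is the property that lets one set of downstream weights serve every missingness scenario.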
