Principled Feature Disentanglement for High-Fidelity Unified Brain MRI Synthesis

Jihoon Cho, Jonghye Woo, Jinah Park

TL;DR

This work tackles the problem of missing sequences in multisequence brain MRI by introducing HF-GAN, a unified synthesis framework built on principled feature disentanglement. A hybrid-fusion encoder separates complementary from modality-specific information, a channel-attention module fuses both into a common latent space, and a transformer-based modality infuser maps that space to the target sequence. On the BraTS and IXI datasets, HF-GAN achieves state-of-the-art PSNR and SSIM, and using it for data imputation substantially improves downstream brain-tumor segmentation. The approach handles all missing-sequence scenarios with an efficient 2D architecture that maintains 3D consistency through an intensity-encoding module, which suggests strong potential for clinical workflows. These results highlight the practical value of disentangled representations for high-fidelity medical image synthesis and data augmentation.

Abstract

Multisequence Magnetic Resonance Imaging (MRI) provides a more reliable diagnosis in clinical applications through complementary information across sequences. However, in practice, the absence of certain MR sequences is a common problem that can lead to inconsistent analysis results. In this work, we propose a novel unified framework for synthesizing multisequence MR images, called hybrid-fusion GAN (HF-GAN). The fundamental mechanism of this work is principled feature disentanglement, which aligns the design of the architecture with the complexity of the features. A powerful many-to-one stream extracts the complex complementary features, while parallel one-to-one streams process modality-specific information. These disentangled features are dynamically integrated into a common latent space by a channel attention-based feature fusion module (CAFF) and then transformed via a modality infuser to generate the target sequence. We validated our framework on public datasets of both healthy and pathological brain MRI. Quantitative and qualitative results show that HF-GAN achieves state-of-the-art performance, with our 2D slice-based framework notably outperforming a leading 3D volumetric model. Furthermore, using HF-GAN for data imputation substantially improves the performance of the downstream brain tumor segmentation task, demonstrating its clinical relevance.

Paper Structure

This paper contains 20 sections, 19 equations, 9 figures, and 5 tables.

Figures (9)

  • Figure 1: Illustration of our framework for synthesizing missing MR sequences. (left) Only accessible MR sequences are used to project into a common latent space. The projected feature representations are converted into the target latent space using the modality infuser, and a decoder generates target MR sequences based on the latent space of the feature representation. (right) If there are multiple accessible MR sequences, the complementary features extracted from the early fusion encoder are used.
  • Figure 2: An example of the channel attention-based feature fusion module $CAFF$ for four MR sequences, which are T1, T2, T1c, and FL. $CAFF$ has two main fusion paths. (orange arrow) The first path involves the integration of significant features to emphasize essential channels, with importance maps for each feature being computed by channel attention modules. (yellow arrow) The second path functions as a residual path, applying weights to the available MR sequences using recomputed importance maps derived from the softmax operation.
  • Figure 3: The structure of the modality infuser $MI$. The target modality is encoded by one-hot encoding.
  • Figure 4: An example of synthesized results for the missing MR sequence on the BraTS dataset. The first column represents the input MR sequences: T1, T2, FL, and T1c are located from the top-left in a clockwise direction. The target MR sequence (GT) is highlighted by a red border.
  • Figure 5: An example of synthesized results for the missing MR sequence on the IXI dataset. The first column represents the input MR sequences: T1, T2, and PD are located from the top in a clockwise direction. The target MR sequence (GT) is highlighted by a red border.
  • ...and 4 more figures
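
The two fusion paths described in the Figure 2 caption can be illustrated with a rough NumPy sketch. The caption does not give layer details, so everything below is an assumption rather than the authors' implementation: the squeeze-and-excitation style channel attention, the reduction ratio, the random stand-in weights, the masking of missing sequences, and the choice to sum the two paths. The names `channel_attention` and `caff_sketch` are hypothetical.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Assumed squeeze-and-excitation style importance map for one (C, H, W) feature."""
    squeezed = feat.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)      # bottleneck with ReLU
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid -> (C,) values in (0, 1)

def caff_sketch(feats, available, w1, w2):
    """Fuse per-sequence features (S, C, H, W) given a boolean availability mask (S,).

    Assumes at least one sequence is available.
    """
    n_seq = feats.shape[0]
    # One importance map per sequence, shape (S, C).
    imp = np.stack([channel_attention(feats[s], w1[s], w2[s]) for s in range(n_seq)])

    # Path 1 (assumed form): emphasize essential channels, skipping missing sequences.
    mask = available.astype(feats.dtype)[:, None]
    emphasized = (feats * (imp * mask)[:, :, None, None]).sum(axis=0)

    # Path 2 (assumed form): residual path weighted by a softmax over available sequences.
    logits = np.where(available[:, None], imp, -np.inf)
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)
    residual = (feats * weights[:, :, None, None]).sum(axis=0)

    return emphasized + residual, weights

# Usage: four sequences (T1, T2, T1c, FL) with T1c missing; weights are random stand-ins.
rng = np.random.default_rng(0)
S, C, H, W = 4, 8, 5, 5
feats = rng.normal(size=(S, C, H, W))
available = np.array([True, True, False, True])
w1 = rng.normal(size=(S, C // 2, C))
w2 = rng.normal(size=(S, C, C // 2))
fused, weights = caff_sketch(feats, available, w1, w2)
print(fused.shape)
```

In this sketch the softmax is taken across sequences for each channel, so the residual weights of the available sequences sum to one and a missing sequence contributes nothing to either path.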