Unpaired Volumetric Harmonization of Brain MRI with Conditional Latent Diffusion

Mengqi Wu, Minhui Yu, Shuaiming Jing, Pew-Thian Yap, Zhengwu Zhang, Mingxia Liu

TL;DR

This work tackles the challenge of site-related variations in multi-site brain MRI by proposing HCLD, an unpaired 3D harmonization framework that operates in a latent space. A pre-trained 3D autoencoder maps MRIs to a 4D latent representation, while a conditional latent diffusion model translates source latent maps to the target domain, conditioned on target style, enabling efficient volume-level harmonization without paired data. The method employs a two-stage training scheme, latent map fusion with AdaIN/IN, and content/style losses (including a Gram-based style loss), with DDIM-based inference for stability and speed. Across three datasets and three tasks, HCLD outperforms state-of-the-art methods in histogram alignment, site-age separation, and voxel-level fidelity, while maintaining anatomical integrity and facilitating downstream analyses. The approach offers a scalable, generalizable solution for harmonizing diverse MRI datasets, with potential extensions to additional sequences and clinically informed conditioning.
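As a rough illustration of two ingredients named above, the sketch below implements the standard adaptive instance normalization (AdaIN) operation and a Gram-matrix style loss on toy latent maps. It is a minimal sketch and not the authors' implementation. The `(C, D, H, W)` latent layout, the function names `adain`, `gram_matrix`, and `gram_style_loss`, the toy shapes, and the mean-squared form of the loss are all assumptions made for illustration; the exact fusion rule and loss weighting used by HCLD are not given in this summary.

```python
import numpy as np


def adain(content, style, eps=1e-5):
    """Standard AdaIN: re-scale each content channel to the style channel's statistics.

    Both inputs are assumed to be latent maps shaped (C, D, H, W).
    """
    axes = (1, 2, 3)  # spatial axes; statistics are computed per channel
    c_mean = content.mean(axis=axes, keepdims=True)
    c_std = content.std(axis=axes, keepdims=True) + eps
    s_mean = style.mean(axis=axes, keepdims=True)
    s_std = style.std(axis=axes, keepdims=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean


def gram_matrix(latent):
    """Channel-by-channel Gram matrix, normalized by the number of spatial positions."""
    flat = latent.reshape(latent.shape[0], -1)
    return flat @ flat.T / flat.shape[1]


def gram_style_loss(pred, target):
    """Mean squared difference between Gram matrices (one common style-loss form)."""
    diff = gram_matrix(pred) - gram_matrix(target)
    return float(np.mean(diff ** 2))


# Toy latent maps standing in for encoded source and target MRIs (hypothetical shapes).
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(4, 8, 8, 8))
target = rng.normal(2.0, 3.0, size=(4, 8, 8, 8))

fused = adain(source, target)
print("style loss before fusion:", gram_style_loss(source, target))
print("style loss after fusion: ", gram_style_loss(fused, target))
```

In this toy setting the fused map keeps the spatial pattern of the source while taking on the per-channel mean and standard deviation of the target, which is why its Gram matrix moves closer to the target's.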

Abstract

Multi-site structural MRI is increasingly used in neuroimaging studies to diversify subject cohorts. However, combining MR images acquired from various sites/centers may introduce site-related non-biological variations. Retrospective image harmonization helps address this issue, but current methods usually perform harmonization on pre-extracted hand-crafted radiomic features, limiting downstream applicability. Several image-level approaches focus on 2D slices, disregarding inherent volumetric information, leading to suboptimal outcomes. To this end, we propose a novel 3D MRI Harmonization framework through Conditional Latent Diffusion (HCLD) by explicitly considering image style and brain anatomy. It comprises a generalizable 3D autoencoder that encodes and decodes MRIs through a 4D latent space, and a conditional latent diffusion model that learns the latent distribution and generates harmonized MRIs with anatomical information from source MRIs while conditioned on target image style. This enables efficient volume-level MRI harmonization through latent style translation, without requiring paired images from target and source domains during training. HCLD is trained and evaluated on 4,158 T1-weighted brain MRIs from three datasets in three tasks, assessing its ability to remove site-related variations while retaining essential biological features. Qualitative and quantitative experiments suggest the effectiveness of HCLD over several state-of-the-art methods.

Paper Structure (34 sections, 15 equations, 9 figures, 4 tables)

Figures (9)

  • Figure 1: Illustration of the proposed HCLD framework. During training, it extracts latent feature maps from source and target MRIs using an encoder $\bm E$, fuses latent representations, and trains a conditional latent diffusion model (cLDM) to estimate the translated latent maps. During inference, it applies the trained cLDM to generate the final translated latent map by iteratively denoising for $T_s$ steps and then uses a decoder $\bm D$ to reconstruct the translated MRI. Both $\bm E$ and $\bm D$ are derived from an autoencoder pre-trained on 3,500 T1-weighted brain MRIs.
  • Figure 2: Results of histogram comparison on 11 sites from SRPBS (with the COI site as the target domain).
  • Figure 3: Log Wasserstein Distance (WD) box plots showing the alignment of the source and target histograms from the SRPBS dataset.
  • Figure 4: Axial views: (a) sample visualization results for SRPBS Subject 8 across 11 sites, and (b) difference maps between each harmonized MRI and its ground truth for three SRPBS subjects (i.e., Subject 2 from HUH, Subject 4 from SWA, and Subject 5 from KPM).
  • Figure 5: Results of volume-level metrics for six HCLD ablation variants on MRIs from SRPBS.
  • ...and 4 more figures
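
The log Wasserstein distance reported for Figure 3 compares intensity histograms of source and target images. The sketch below shows one generic way such a quantity can be computed for two sets of voxel intensities, using the quantile-function form of the 1D Wasserstein-1 distance. It is only an illustration under stated assumptions: the function name `wasserstein_1d`, the number of quantiles, and the synthetic intensities are hypothetical, and the paper's exact preprocessing and histogram binning are not described in this summary.

```python
import numpy as np


def wasserstein_1d(a, b, n_quantiles=1000):
    """Approximate 1D Wasserstein-1 distance as the mean absolute gap between quantile functions."""
    q = (np.arange(n_quantiles) + 0.5) / n_quantiles  # midpoints of equal-width probability bins
    return float(np.mean(np.abs(np.quantile(a, q) - np.quantile(b, q))))


# Synthetic voxel intensities standing in for a target-site and a source-site image.
rng = np.random.default_rng(0)
target_vox = rng.normal(0.5, 0.1, size=5000)
source_vox = target_vox + 0.2  # a pure intensity shift between "sites"

wd = wasserstein_1d(source_vox, target_vox)
log_wd = np.log(wd)
print("WD:", wd, "log WD:", log_wd)
```

A lower (more negative) log WD indicates that the source intensity distribution sits closer to the target's, which is the sense in which the box plots summarize histogram alignment.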