Disentangled Latent Energy-Based Style Translation: An Image-Level Structural MRI Harmonization Framework

Mengqi Wu; Lintao Zhang; Pew-Thian Yap; Hongtu Zhu; Mingxia Liu

TL;DR

This work tackles non-biological site effects in multi-site structural MRI by introducing DLEST, a disentangled latent energy-based style translation framework for unpaired image-level harmonization. DLEST separates image generation (SIG) from style translation (SST) in a low-dimensional latent space and uses a lightweight energy-based model with Langevin dynamics (SGLD) to align source latent codes with target-site distributions, followed by site-specific MRI synthesis (SMS). The approach demonstrates improved histogram alignment, reduced site-classification signals, and enhanced segmentation performance, while enabling efficient adaptation to new sites with minimal retraining. By combining latent-space generation, energy-based style transfer, and synthesis, DLEST offers a scalable, generalizable preprocessing step and data-augmentation tool for multi-site MRI analyses.

Abstract

Brain magnetic resonance imaging (MRI) has been extensively employed across clinical and research fields, but is often sensitive to site effects arising from non-biological variations such as differences in field strength and scanner vendors. Numerous retrospective MRI harmonization techniques have demonstrated encouraging outcomes in reducing site effects at the image level. However, existing methods generally suffer from high computational requirements and limited generalizability, restricting their applicability to unseen MRIs. In this paper, we design a novel disentangled latent energy-based style translation (DLEST) framework for unpaired image-level MRI harmonization, consisting of (a) site-invariant image generation (SIG), (b) site-specific style translation (SST), and (c) site-specific MRI synthesis (SMS). Specifically, SIG employs a latent autoencoder to encode MRIs into a low-dimensional latent space and reconstruct MRIs from latent codes. SST utilizes an energy-based model to capture the global latent distribution of a target domain and translate source latent codes toward the target domain, while SMS enables MRI synthesis with a target-specific style. By disentangling image generation and style translation in latent space, DLEST can achieve efficient style translation. Our model was trained on T1-weighted MRIs from a public dataset (with 3,984 subjects across 58 acquisition sites/settings) and validated on an independent dataset (with 9 traveling subjects scanned in 11 sites/settings) in four tasks: histogram and feature visualization, site classification, brain tissue segmentation, and site-specific structural MRI synthesis. Qualitative and quantitative results demonstrate the superiority of our method over several state-of-the-art methods.
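
To make the latent-space style translation idea more concrete, the following is a minimal sketch of Langevin-dynamics (SGLD) updates that move a source latent code toward low-energy regions of a target distribution. It is not the paper's implementation: the quadratic energy, the function names (`toy_energy_grad`, `sgld_translate`, `distance`), and all parameter values are assumptions chosen for illustration. In DLEST the energy would come from a learned energy-based model rather than a closed-form expression.

```python
import math
import random


def toy_energy_grad(z, target_mean, target_std):
    """Gradient of a toy energy E(z) = sum((z - mean)^2) / (2 * std^2).

    Assumption: this closed-form gradient stands in for the gradient of a
    learned energy network evaluated at the latent code z.
    """
    return [(zi - mi) / (target_std ** 2) for zi, mi in zip(z, target_mean)]


def sgld_translate(z_source, target_mean, target_std=1.0,
                   n_steps=200, step_size=0.05, noise_scale=1.0, seed=0):
    """Translate a source latent code toward the target latent distribution.

    Update rule per step:
        z <- z - (step_size / 2) * grad E(z) + noise_scale * sqrt(step_size) * eps
    where eps is standard Gaussian noise. Setting noise_scale=0 gives plain
    gradient descent on the energy.
    """
    rng = random.Random(seed)
    z = list(z_source)
    for _ in range(n_steps):
        grad = toy_energy_grad(z, target_mean, target_std)
        z = [
            zi - 0.5 * step_size * gi
            + noise_scale * math.sqrt(step_size) * rng.gauss(0.0, 1.0)
            for zi, gi in zip(z, grad)
        ]
    return z


def distance(a, b):
    """Euclidean distance between two latent codes."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


if __name__ == "__main__":
    source_code = [5.0, 5.0, 5.0, 5.0]   # hypothetical source-site latent code
    target_mean = [0.0, 0.0, 0.0, 0.0]   # hypothetical target-site latent mean
    translated = sgld_translate(source_code, target_mean)
    print("distance before:", distance(source_code, target_mean))
    print("distance after: ", distance(translated, target_mean))
```

Because the updates act on a low-dimensional latent vector rather than on full-resolution images, each step is cheap, which is the intuition behind the efficiency claim in the abstract. A decoder, not shown here, would then map the translated code back to image space.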

Paper Structure (31 sections, 10 equations, 10 figures, 3 tables)

Introduction
Related Work
Image-Level MRI Harmonization
Energy-Based Models for Image Generation
Materials and Methodology
Materials and Image Preprocessing
Datasets
Data Preprocessing
Proposed Methodology
Problem Formulation
Site-Invariant Image Generation (SIG)
Site-Specific Style Translation (SST)
Site-Specific MRI Synthesis (SMS)
Implementation
Experiments
...and 16 more sections

Figures (10)

Figure 1: The proposed disentangled latent energy-based style translation (DLEST) framework for MRI harmonization at the image level, consisting of (a) site-invariant image generation that encodes images into a low-dimensional latent space and reconstructs images from latent codes, (b) site-specific style translation that facilitates implicit style translation within the latent space, and (c) site-specific MRI synthesis that generates diverse synthetic MRIs with a given target site style.
Figure 2: Histogram comparison of 10 source sites and a target site (COI) across all 9 traveling subjects from the SRPBS dataset. The first plot shows pre-harmonization histograms, while the subsequent plots depict post-harmonization histograms by each competing method and our method DLEST, respectively. WM: white matter; GM: gray matter; CSF: cerebrospinal fluid.
Figure 3: Visualization of MRI features of 11 sites across 9 subjects from SRPBS, with MRI (a) before and (b) after harmonization by DLEST. Each color denotes a specific site, while each point denotes extracted features from an MRI slice of a specific subject.
Figure 4: Segmentation results of CSF, GM, and WM on SRPBS in terms of (left) Dice coefficient and (right) Jaccard index. Each U-Net is trained on the target site COI and validated on the source site HUH from SRPBS.
Figure 5: Segmentation of brain tissues (i.e., CSF, GM, WM) using unharmonized MRI data (Baseline) and data harmonized by CycleGAN (top competing method) and DLEST (ours). White pixels denote the correct segmentation, red pixels denote under-segmentation, and blue pixels denote over-segmentation.
...and 5 more figures
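
The Dice coefficient and Jaccard index named in the Figure 4 caption are standard overlap metrics for binary masks. The sketch below shows how they can be computed for a single tissue class. The function name `dice_and_jaccard`, the flat 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the paper.

```python
def dice_and_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two equally sized binary masks.

    pred, truth: flat sequences of 0/1 labels for one tissue class
    (for example, a flattened GM mask). Hypothetical helper.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")

    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection

    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0

    dice = 2.0 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1]
    reference = [1, 0, 0, 1, 1]
    dice, jaccard = dice_and_jaccard(predicted, reference)
    print(f"Dice = {dice:.4f}, Jaccard = {jaccard:.4f}")
```

The two metrics are monotonically related (Jaccard = Dice / (2 - Dice)), so they rank methods identically on a single mask pair, although their averages over many subjects can differ.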
