Clinical-ComBAT: a diffusion-weighted MRI harmonization method for clinical applications

Gabriel Girard, Manon Edde, Félix Dumais, Yoan David, Matthieu Dumont, Guillaume Theaud, Jean-Christophe Houde, Arnaud Boré, Maxime Descoteaux, Pierre-Marc Jodoin

TL;DR

Clinical-ComBAT addresses cross-site biases in diffusion MRI by harmonizing each site to a large normative reference using a non-linear polynomial data model and site-specific regression parameters. It introduces a per-site harmonization framework with a single multiplicative variance term and a sitewise additive bias, stabilized by priors from the reference site and MAP estimation for moving sites. A goodness-of-fit QC with Bhattacharyya distance and an automatic hyperparameter-tuning scheme enable robust extrapolation, particularly in small-sample or evolving-clinic scenarios. Across real and synthetic datasets, Clinical-ComBAT outperforms ComBAT in aligning diffusion metrics (MD, FA, AFD) and supports incremental site integration for clinical deployment.
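
The goodness-of-fit check mentioned above relies on the Bhattacharyya distance. As a rough illustration only, the sketch below computes it in closed form for two univariate Gaussians. Treating both distributions as Gaussian, and the function name `bhattacharyya_gaussian`, are assumptions made here and are not details taken from the paper.

```python
import math


def bhattacharyya_gaussian(mu_p, var_p, mu_q, var_q):
    """Bhattacharyya distance between two univariate Gaussians.

    Hypothetical helper: it assumes each distribution is summarized
    by a mean and a (strictly positive) variance.
    """
    if var_p <= 0 or var_q <= 0:
        raise ValueError("variances must be positive")
    # Term driven by the gap between the two means.
    mean_term = 0.25 * (mu_p - mu_q) ** 2 / (var_p + var_q)
    # Term driven by the mismatch between the two variances.
    var_term = 0.5 * math.log((var_p + var_q) / (2.0 * math.sqrt(var_p * var_q)))
    return mean_term + var_term
```

The distance is 0 when the two Gaussians coincide and grows as their overlap shrinks, which is why lower values indicate closer alignment with the reference site.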

Abstract

Diffusion-weighted magnetic resonance imaging (DW-MRI) derived scalar maps are effective for assessing neurodegenerative diseases and microstructural properties of white matter in a large number of brain conditions. However, DW-MRI data from multiple acquisition sites cannot be combined without harmonization to mitigate scanner-specific biases. While the widely used ComBAT method reduces site effects in research, its reliance on linear covariate relationships, homogeneous populations, fixed site numbers, and well-populated sites constrains its clinical use. To overcome these limitations, we propose Clinical-ComBAT, a method designed for real-world clinical scenarios. Clinical-ComBAT harmonizes each site independently, enabling flexibility as new data and clinics are introduced. It incorporates a non-linear polynomial data model, site-specific harmonization referenced to a normative site, and variance priors adaptable to small cohorts. It further includes hyperparameter tuning and a goodness-of-fit metric for harmonization assessment. We demonstrate its effectiveness on simulated and real data, showing improved alignment of diffusion metrics and enhanced applicability for normative modeling.
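
To make the idea of a polynomial model stabilized by a reference-site prior more concrete, here is a minimal sketch of one way such a fit could look. It assumes a single covariate (age), a Gaussian prior on the moving-site coefficients centered on the reference coefficients, and one scalar strength `lam`. The function names and the age rescaling constants are hypothetical, and the paper's actual priors and estimator may differ.

```python
import numpy as np


def design_matrix(age, degree=3, center=50.0, scale=30.0):
    """Polynomial design matrix with columns [1, z, z^2, ..., z^degree].

    Age is rescaled to z = (age - center) / scale to keep the problem
    well conditioned; center and scale are illustrative constants.
    """
    z = (np.asarray(age, dtype=float) - center) / scale
    return np.vander(z, N=degree + 1, increasing=True)


def fit_reference(age, y, degree=3):
    """Ordinary least-squares polynomial fit for the reference site."""
    X = design_matrix(age, degree)
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta


def fit_moving_map(age, y, beta_ref, lam, degree=3):
    """MAP-style fit for a moving site, shrunk toward the reference.

    Minimizes ||y - X b||^2 + lam * ||b - beta_ref||^2, whose closed-form
    solution is (X^T X + lam I)^{-1} (X^T y + lam beta_ref).
    """
    X = design_matrix(age, degree)
    y = np.asarray(y, dtype=float)
    A = X.T @ X + lam * np.eye(X.shape[1])
    b = X.T @ y + lam * np.asarray(beta_ref, dtype=float)
    return np.linalg.solve(A, b)
```

With `lam = 0` this reduces to an unregularized fit, which can overfit a handful of subjects. As `lam` grows, the moving-site curve is pulled toward the reference curve, which is the behavior one would want when extrapolating from a small cohort.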

Paper Structure

This paper contains 28 sections, 57 equations, 11 figures, 1 table, 4 algorithms.

Figures (11)

  • Figure 1: Clinical-ComBAT harmonization process for aligning a moving site to a reference site. a) Raw data distributions from the reference (black) and moving (green) sites. b) Data after Clinical-ComBAT harmonization. c) Step-by-step illustration of the Clinical-ComBAT procedure. Scatter plots show the reference (black) and moving (green) data. Step 1 consists of fitting a regression model to both the reference and moving site data following Eq. (\ref{eq:betaRv}) and (\ref{eq:betaMv}). In Step 2, the site-specific covariate effects and intercept are removed from the moving site data. Step 3 applies a variance correction to align the data dispersion with that of the reference site. In Step 4, the adjusted data are transformed to match the reference site distribution. Steps 2, 3, and 4 derive from Eq. (\ref{eq:Clinical-ComBAT_data}).
  • Figure 2: Example of two model fits using a third-degree polynomial ($P=3$). Black dots and their fitted curve correspond to the reference Cam-CAN dataset, while red dots represent 10 synthetic moving data points. Left: the moving model with $\vec{\lambda}=0$ overfits the data and diverges outside the 60–80 age range. Right: the auto-tuned model ($\vec{\lambda}=32762$) does not deviate too much from the reference model.
  • Figure 3: Harmonization of mean diffusivity (MD) using Clinical-ComBAT and ComBAT for the NIMH site (green), and Track-TBI site A (blue) and site B (cyan). The top three rows show harmonized white matter skeleton masks with corresponding $D_B$ values, where orange dots indicate TBI subjects. The bottom row presents stacked histograms of $D_B$ across all white matter bundles and sites. Lower $D_B$ values indicate closer alignment with the reference site, underscoring both the necessity of harmonization and the superior performance of Clinical-ComBAT over ComBAT.
  • Figure 4: Synthetic mean diffusivity (MD) data harmonization performance of ComBAT and Clinical-ComBAT. a–c) Distributions of target and moving site data before and after harmonization under different simulated biases: a) slope bias; b) combined multiplicative and additive biases; c) combined slope, multiplicative, and additive biases. d) Root Mean Squared Error (RMSE) of ComBAT and Clinical-ComBAT for increasing slope and multiplicative bias.
  • Figure 5: Harmonization performance of ComBAT and Clinical-ComBAT on synthetic mean diffusivity (MD) data of the white matter skeleton mask, with a limited amount of training data. The test set comprises 100 randomly selected subjects, and the remaining 341 subjects are used to generate training sets. a) Target site data (black) and raw moving site data (red) under simulated biases: slope $S = 0.50$, multiplicative $M = 1.25$, and additive $A = 1.10$. b) Root Mean Squared Error (RMSE) across increasing numbers of randomly sampled training subjects. Each condition was repeated 30 times. The curves show mean RMSE and shaded areas represent one standard deviation.
  • ...and 6 more figures
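
The four steps described in the Figure 1 caption can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the paper's implementation: age is the only covariate, each site is fit by plain least squares without priors, and dispersion is summarized by a single residual standard deviation per site. All function names and the age rescaling constants are hypothetical.

```python
import numpy as np


def poly_design(age, degree=3, center=50.0, scale=30.0):
    """Polynomial design matrix in rescaled age (illustrative constants)."""
    z = (np.asarray(age, dtype=float) - center) / scale
    return np.vander(z, N=degree + 1, increasing=True)


def fit_site(age, y, degree=3):
    """Step 1: fit a polynomial regression to one site.

    Returns the coefficients and the residual standard deviation.
    """
    X = poly_design(age, degree)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma = resid.std(ddof=X.shape[1])
    return beta, sigma


def harmonize(age_mov, y_mov, ref_fit, mov_fit, degree=3):
    """Steps 2 to 4: map moving-site data onto the reference site."""
    beta_ref, sigma_ref = ref_fit
    beta_mov, sigma_mov = mov_fit
    X = poly_design(age_mov, degree)
    # Step 2: remove the moving site's own covariate effects and intercept.
    centered = np.asarray(y_mov, dtype=float) - X @ beta_mov
    # Step 3: rescale the residuals to the reference site's dispersion.
    rescaled = centered * (sigma_ref / sigma_mov)
    # Step 4: add back the reference site's covariate effects and intercept.
    return rescaled + X @ beta_ref
```

In this simplified form, refitting the harmonized moving data recovers the reference coefficients and the reference residual standard deviation, because the moving-site residuals are orthogonal to the design matrix. With very few subjects the unregularized moving fit becomes unreliable, which is the situation the reference-site priors are meant to address.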