ComBAT Harmonization for diffusion MRI: Challenges and Best Practices

Pierre-Marc Jodoin, Manon Edde, Gabriel Girard, Félix Dumais, Guillaume Theaud, Matthieu Dumont, Jean-Christophe Houde, Yoan David, Maxime Descoteaux

TL;DR

ComBAT harmonization for diffusion MRI is analyzed with focus on its linear data-generation model $y_{ijv}=\alpha_v+\mathbf{x}_{ij}^T\bm{\beta}_v+\gamma_{iv}+\delta_{iv}\varepsilon_{ijv}$ and the critical assumption that $\bm{\beta}_v$ is identical across sites. The paper shows that violations, especially a site-dependent slope induced by a multiplicative factor $S_i$ or biased variance $\delta_{iv}^2$, can lead to poor harmonization. To address this, it introduces Pairwise-ComBAT, which harmonizes each moving site to a fixed reference site and uses a goodness-of-fit metric based on the Bhattacharyya distance to quantify overlap. Through experiments on CamCAN, Modified-CamCAN, ADNI, and NIMH, it derives practical recommendations for data inspection, covariate inclusion, sample size, age range, sex balance, and handling pathological populations, to improve reproducibility and clinical applicability.
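The data-generation model and the overlap metric named above can be illustrated with a short sketch. It is not the paper's implementation: it assumes a single covariate (age), a single feature, illustrative parameter values in arbitrary units, and it scores overlap with the closed-form Bhattacharyya distance between two univariate Gaussians, which may differ from the exact goodness-of-fit computation used in the paper. The helper names (`simulate_site`, `bhattacharyya_gaussian`, `mean_var`) are hypothetical.

```python
import math
import random

def simulate_site(n, alpha, beta, gamma, delta, rng, age_range=(18.0, 88.0)):
    """Draw n (age, y) pairs from y = alpha + age*beta + gamma + delta*eps.

    eps ~ N(0, 1). The age range and all parameter values are illustrative
    assumptions, not values taken from the paper.
    """
    ages = [rng.uniform(*age_range) for _ in range(n)]
    ys = [alpha + a * beta + gamma + delta * rng.gauss(0.0, 1.0) for a in ages]
    return ages, ys

def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Closed-form Bhattacharyya distance between two univariate Gaussians."""
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term

def mean_var(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v

rng = random.Random(0)
# Reference site: no additive offset, small noise scale.
_, y_ref = simulate_site(500, alpha=1.0, beta=0.002, gamma=0.0, delta=0.05, rng=rng)
# Moving site: same slope, but an additive offset and a larger noise scale.
_, y_mov = simulate_site(500, alpha=1.0, beta=0.002, gamma=0.1, delta=0.10, rng=rng)

d_same = bhattacharyya_gaussian(*mean_var(y_ref), *mean_var(y_ref))
d_diff = bhattacharyya_gaussian(*mean_var(y_ref), *mean_var(y_mov))
print(f"reference vs itself: {d_same:.4f}")
print(f"reference vs moving: {d_diff:.4f}")
```

A distance near zero indicates that the two populations overlap almost completely; larger values indicate a remaining site effect.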

Abstract

Over the years, ComBAT has become the standard method for harmonizing MRI-derived measurements, with its ability to compensate for site-related additive and multiplicative biases while preserving biological variability. However, ComBAT relies on a set of assumptions that, when violated, can result in flawed harmonization. In this paper, we thoroughly review ComBAT's mathematical foundation, outlining these assumptions and exploring their implications for the demographic composition necessary for optimal results. Through a series of experiments involving a slightly modified version of ComBAT called Pairwise-ComBAT, tailored for normative modeling applications, we assess the impact of various population characteristics, including population size, age distribution, the absence of certain covariates, and the magnitude of additive and multiplicative factors. Based on these experiments, we present five essential recommendations that should be carefully considered to enhance consistency and support reproducibility, two essential factors for open science, collaborative research, and real-life clinical deployment.
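To make the idea of harmonizing a moving site onto a fixed reference site concrete, here is a deliberately simplified location-scale sketch. It is not Pairwise-ComBAT itself: it assumes one covariate and one feature, takes the slope from the reference-site fit, and omits the empirical-Bayes estimation that ComBAT relies on. All function names and parameter values are hypothetical. The second half of the example shows how a site-dependent slope survives this kind of correction.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual std)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    res = [y - (a + b * x) for x, y in zip(xs, ys)]
    std = (sum(r * r for r in res) / (n - 2)) ** 0.5
    return a, b, std

def harmonize_to_reference(x_mov, y_mov, ref_fit):
    """Map moving-site values onto the reference trend and noise scale.

    Simplification (an assumption of this sketch): the slope is shared and
    taken from the reference fit; no empirical-Bayes shrinkage is applied.
    """
    a_ref, b_ref, std_ref = ref_fit
    # Residuals of the moving site around the reference trend.
    res = [y - (a_ref + b_ref * x) for x, y in zip(x_mov, y_mov)]
    gamma = sum(res) / len(res)  # additive site offset
    std_mov = (sum((r - gamma) ** 2 for r in res) / (len(res) - 1)) ** 0.5
    scale = std_ref / std_mov    # ratio of noise scales
    return [a_ref + b_ref * x + scale * (r - gamma) for x, r in zip(x_mov, res)]

def simulate(n, slope, offset, noise, rng):
    """Illustrative data: y = 1 + slope*age + offset + noise*eps."""
    ages = [rng.uniform(18.0, 88.0) for _ in range(n)]
    ys = [1.0 + slope * a + offset + noise * rng.gauss(0.0, 1.0) for a in ages]
    return ages, ys

rng = random.Random(1)
x_ref, y_ref = simulate(500, slope=0.002, offset=0.0, noise=0.05, rng=rng)
ref_fit = fit_line(x_ref, y_ref)

# Case 1: the moving site shares the reference slope (assumption holds).
x_ok, y_ok = simulate(500, slope=0.002, offset=0.1, noise=0.10, rng=rng)
h_ok = harmonize_to_reference(x_ok, y_ok, ref_fit)

# Case 2: the moving site has twice the slope (assumption violated).
x_bad, y_bad = simulate(500, slope=0.004, offset=0.1, noise=0.10, rng=rng)
h_bad = harmonize_to_reference(x_bad, y_bad, ref_fit)

slope_gap_ok = abs(fit_line(x_ok, h_ok)[1] - ref_fit[1])
slope_gap_bad = abs(fit_line(x_bad, h_bad)[1] - ref_fit[1])
print(f"slope gap after harmonization, shared slope:   {slope_gap_ok:.5f}")
print(f"slope gap after harmonization, mismatched slope: {slope_gap_bad:.5f}")
```

In this sketch the additive offset and noise scale are corrected in both cases, but only the shared-slope case ends up aligned with the reference trend, which is the failure mode the paper attributes to a violated slope assumption.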

Paper Structure

This paper contains 39 sections, 17 equations, 30 figures.

Figures (30)

  • Figure 1: Illustration of the seven steps of a typical Pairwise-ComBAT harmonization of two sites, highlighting the effect of each variable of Eq.(17), from the raw data in a) to the harmonized data in j). The gray curves in b) illustrate the overall trend of the population from site 1 and site 2, whereas the scatter plots show the values of the $J$ subjects of each site.
  • Figure 2: Pairwise-ComBAT harmonization of the mean diffusivity (MD) in the modified CamCAN dataset against the unbiased CamCAN (N=441). (a) The parameters A (additive), M (multiplicative), S (slope) used to generate the modified CamCAN version (cf. Eq.(\ref{eq:combat_M1_M2_A_Bias})). (b) (left) Raw data with the slopes for CamCAN (solid black line), modified CamCAN (dashed black line), and the slope estimated by Pairwise-ComBAT (solid blue line). (right) The Pairwise-ComBAT harmonization. (c) Distances between the raw and harmonized populations.
  • Figure 3: Experiment 1. Harmonization of the Modified-CamCAN population (in gray) on the CamCAN population (in red). Results for different multiplicative factors on (a) the bias (A), (b) the slope (S), and (c) the noise variance (M), as described in Eq.(\ref{eq:combat_M1_M2_A_Bias}). The left columns are the original data and the right columns are the harmonized data. Green checks indicate correct harmonization, red X's indicate poor harmonization.
  • Figure 4: a) The quadratic error in the estimation of the moving site variance $\hat{\delta}_{iv}^{2*}$ for different slope and variance multiplicative factors $S$ and $M$. b) Harmonization of the mean diffusivity (MD) of the NIMH population (red) onto that of CamCAN (gray). The different slopes between NIMH and CamCAN (left) lead to erroneous harmonization outcomes (right). The red cross indicates poor harmonization.
  • Figure 5: (a) Mean Absolute Difference (MAD) training and testing harmonization errors for various numbers of training samples $N$ from the Modified-CamCAN moving site. (b) and (c) illustrate the effect of using too few or enough training samples on the training and testing fit. Green checks indicate correct harmonization, red X's indicate poor harmonization.
  • ...and 25 more figures