Denoising diffusion networks for normative modeling in neuroimaging
Luke Whitbread, Lyle J. Palmer, Mark Jenkinson
TL;DR
This work addresses the need for calibrated, multivariate normative modeling in neuroimaging by proposing conditional denoising diffusion probabilistic models (DDPMs) to estimate the joint density of imaging-derived phenotypes (IDPs) given covariates, $\hat{p}(\mathbf{y} \mid \mathbf{c})$. It compares two denoiser backbones, a FiLM-conditioned MLP and SAINT, on synthetic and UK Biobank data with up to 200 IDPs. The evaluation covers centile calibration, distributional fidelity, and dependence structure (ACE, ECP, PIT, KS tests, energy distance, MMD, and pairwise and higher-order dependence analyses), plus memorisation checks. At higher dimensions, the SAINT backbone is substantially better calibrated than the MLP and better preserves higher-order dependence, enabling scalable joint normative modeling that remains compatible with standard per-IDP pipelines. Diffusion models also support conditional sampling for centile estimation and counterfactual analysis. Overall, diffusion-based normative modeling offers a flexible, data-driven route to calibrated multivariate deviation profiles, with practical utility for personalized biomarkers and for stress-testing neuroimaging pipelines. Code is provided for replication, and the results indicate that tabular diffusion models can extend normative analyses beyond univariate IDPs.
Abstract
Normative modeling estimates reference distributions of biological measures conditional on covariates, enabling centiles and clinically interpretable deviation scores to be derived. Most neuroimaging pipelines fit one model per imaging-derived phenotype (IDP), which scales well but discards multivariate dependence that may encode coordinated patterns. We propose denoising diffusion probabilistic models (DDPMs) as a unified conditional density estimator for tabular IDPs, from which univariate centiles and deviation scores are derived by sampling. We use two denoiser backbones: (i) a multilayer perceptron (MLP) conditioned through feature-wise linear modulation (FiLM) and (ii) a tabular transformer with feature self-attention and intersample attention (SAINT), which conditions on covariates through learned embeddings. We evaluate on a synthetic benchmark with heteroscedastic and multimodal age effects and on UK Biobank FreeSurfer phenotypes, scaling from 2 to 200 IDPs. Our evaluation suite includes centile calibration (absolute centile error, empirical coverage, and the probability integral transform), distributional fidelity (Kolmogorov-Smirnov tests), multivariate dependence diagnostics, and nearest-neighbour memorisation analysis. At low dimensions, diffusion models deliver well-calibrated per-IDP outputs comparable to traditional baselines while jointly modeling realistic dependence structure. At higher dimensions, the transformer backbone remains substantially better calibrated than the MLP and better preserves higher-order dependence, enabling scalable joint normative models that remain compatible with standard per-IDP pipelines. These results support diffusion-based normative modeling as a practical route to calibrated multivariate deviation profiles in neuroimaging.
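As an illustration of how univariate centiles and deviation scores can be derived by sampling from a conditional model, the following is a minimal sketch rather than the authors' implementation. The function `toy_conditional_sampler` is a hypothetical stand-in for a trained conditional DDPM, and the mid-rank probability integral transform (PIT) estimator and the mapping of PIT values to z-scores through the standard normal quantile function are assumptions chosen for simplicity.

```python
import numpy as np
from statistics import NormalDist


def toy_conditional_sampler(age, n_samples, rng):
    """Hypothetical stand-in for a trained conditional diffusion sampler.

    Returns draws of shape (n_samples, 2) for two made-up IDPs whose
    mean and spread depend on age (heteroscedastic toy example).
    """
    mean = np.array([1.0 - 0.01 * age, 0.5 + 0.002 * age])
    scale = np.array([0.1 + 0.002 * age, 0.05])
    return rng.normal(mean, scale, size=(n_samples, 2))


def centiles_from_samples(samples, levels=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Per-IDP empirical centiles; output shape is (len(levels), n_idps)."""
    return np.quantile(samples, levels, axis=0)


def pit_values(samples, observed):
    """Empirical PIT per IDP: smoothed fraction of samples <= observed.

    The (rank + 0.5) / (n + 1) form keeps values strictly inside (0, 1),
    so the normal quantile below is always finite (an assumed convention).
    """
    n = samples.shape[0]
    ranks = (samples <= observed).sum(axis=0)
    return (ranks + 0.5) / (n + 1.0)


def deviation_z(pit):
    """Map PIT values to z-scores with the standard normal quantile function."""
    nd = NormalDist()
    flat = [nd.inv_cdf(float(p)) for p in np.ravel(pit)]
    return np.asarray(flat).reshape(np.shape(pit))


# Usage: draw conditional samples for one subject's covariates, then
# summarise each IDP marginally.
rng = np.random.default_rng(0)
samples = toy_conditional_sampler(age=60.0, n_samples=5000, rng=rng)
observed = np.array([0.40, 0.70])  # hypothetical observed IDP values
centiles = centiles_from_samples(samples)
z_scores = deviation_z(pit_values(samples, observed))
```

Because all IDPs for a subject are drawn jointly, the same sample matrix can also feed multivariate summaries, while the per-IDP centiles and z-scores stay comparable with those from conventional one-model-per-IDP pipelines.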
