Multimodal normative modeling in Alzheimer's Disease with introspective variational autoencoders

Sayantan Kumar, Peijie Qiu, Aristeidis Sotiras

TL;DR

This work tackles heterogeneity in Alzheimer's disease by advancing normative modeling with mmSIVAE, a multimodal soft-introspective VAE that uses MOPOE to fuse MRI and amyloid-PET signals. The model learns a faithful healthy reference distribution, yielding sharper latent representations and more discriminative subject-specific deviation scores for outlier detection. Deviation maps in latent and feature spaces align with established AD pathology, offering interpretable region-level patterns and potential for patient stratification. The approach demonstrates improved reconstruction of controls, stronger control–disease separation, and robust multimodal integration with implications for deviation-based analyses across multimodal clinical data.

Abstract

Normative modeling learns a healthy reference distribution and quantifies subject-specific deviations to capture heterogeneous disease effects. In Alzheimer's disease (AD), multimodal neuroimaging offers complementary signals, but VAE-based normative models often (i) fit the healthy reference distribution imperfectly, inflating false positives, and (ii) use posterior aggregation (e.g., PoE/MoE) that can yield weak multimodal fusion in the shared latent space. We propose mmSIVAE, a multimodal soft-introspective variational autoencoder combined with Mixture-of-Product-of-Experts (MOPOE) aggregation to improve reference fidelity and multimodal integration. We compute deviation scores in latent space and feature space as distances from the learned healthy distributions, and map statistically significant latent deviations to regional abnormalities for interpretability. On ADNI MRI regional volumes and amyloid PET SUVR, mmSIVAE improves reconstruction on held-out controls and produces more discriminative deviation scores for outlier detection than VAE baselines, with higher likelihood ratios and clearer separation between control and AD-spectrum cohorts. Deviation maps highlight region-level patterns aligned with established AD-related changes. More broadly, our results highlight the importance of training objectives that prioritize reference-distribution fidelity and robust multimodal posterior aggregation for normative modeling, with implications for deviation-based analysis across multimodal clinical data.
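
The posterior aggregation named in the abstract can be illustrated with a small sketch. It assumes diagonal Gaussian unimodal posteriors and enumerates one Product-of-Experts (PoE) component for every non-empty subset of modalities, which is the usual reading of a Mixture-of-Product-of-Experts. The function names `poe` and `mopoe_components`, the modality keys, and the omission of a prior expert are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def poe(experts):
    """Fuse diagonal Gaussian experts, given as (mean, variance) lists.

    The product of Gaussians has precision equal to the sum of the expert
    precisions and a precision-weighted mean.
    """
    dims = len(experts[0][0])
    mu_out, var_out = [], []
    for d in range(dims):
        precision = sum(1.0 / var[d] for _, var in experts)
        weighted_mean = sum(mu[d] / var[d] for mu, var in experts)
        var_out.append(1.0 / precision)
        mu_out.append(weighted_mean / precision)
    return mu_out, var_out


def mopoe_components(posteriors):
    """Return one PoE component per non-empty subset of modalities.

    `posteriors` maps a modality name to its (mean, variance) lists.
    The joint posterior is then treated as a uniform mixture over these
    components (an assumption of this sketch).
    """
    names = sorted(posteriors)
    components = {}
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            components[subset] = poe([posteriors[n] for n in subset])
    return components


# Hypothetical one-dimensional posteriors for the two modalities.
posteriors = {"mri": ([0.0], [1.0]), "pet": ([2.0], [1.0])}
components = mopoe_components(posteriors)
```

With two modalities this yields three components: the two unimodal posteriors and their product. Sampling from the mixture exposes the decoders to both unimodal and fused latents, which is the intuition behind preferring it over PoE or MoE alone.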

Paper Structure (34 sections, 32 equations, 5 figures, 1 table)

This paper contains 34 sections, 32 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Training flow of mmSIVAE. The ELBO for real samples is optimized for both encoders and decoders, while the encoders also optimize the expELBO to 'push away' generated samples from the latent space. The decoders optimize the ELBO for the generated samples to 'fool' the encoders.
  • Figure 2: Training flow of mmSIVAE. The ELBO for real samples is optimized for both encoders and decoders, while the encoders also optimize the expELBO to 'push away' generated samples from the latent space. The decoders optimize the ELBO for the generated samples to 'fool' the encoders.
  • Figure 3: Average reconstruction error when reconstructing MRI volumes (a) and amyloid SUVR (b) for our proposed mmSIVAE and baselines (SIVAE, mmVAE and unimodal VAE). The brain atlases showing the reconstruction errors for both MRI volumes and amyloid SUVR were visualized using the ggseg package in Python.
  • Figure 4: Histograms showing the distribution of feature Mahalanobis deviations $D_{mf}$ (Eq. \ref{d_mf}) for the test cohort (AD patients) and a holdout healthy control cohort. The value in the caption of each subfigure is the Earth mover's distance between the two distributions. A higher distance indicates less overlap, i.e., better separation between the healthy and disease cohorts.
  • Figure 5: A. Five latent dimensions (6, 7, 8, 14, and 15) out of 15 show statistically significant deviations, defined by mean absolute $Z_{ml} > 1.96$ ($p<0.05$). The dotted red line denotes the significance threshold ($Z=1.96$); latent dimensions exceeding this threshold are used for mapping to feature-space deviations. B. Brain atlas maps (Desikan–Killiany atlas for 66 cortical regions and Aseg atlas for 24 subcortical regions) illustrate pairwise group differences in regional deviation magnitude between control and disease groups. Colors indicate effect size (Cohen's $d$), with $d=0.2$, $0.5$, and $0.8$ corresponding to small, medium, and large effects, respectively. Gray regions indicate no statistically significant differences after FDR correction.
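
The statistics named in the captions of Figures 4 and 5 can be sketched in a few lines. The paper's exact definitions are not reproduced here, so the following assumes that latent Z-scores are standardized against a holdout control cohort, that Cohen's $d$ uses the pooled standard deviation, and that the Earth mover's distance is computed on one-dimensional samples of equal size. All function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def latent_z_scores(patients, controls):
    """Standardize each patient's latent vector against the control cohort."""
    dims = len(controls[0])
    mu = [mean(c[d] for c in controls) for d in range(dims)]
    sd = [stdev(c[d] for c in controls) for d in range(dims)]
    return [[(p[d] - mu[d]) / sd[d] for d in range(dims)] for p in patients]


def significant_dims(z_scores, threshold=1.96):
    """Indices of latent dimensions whose mean absolute Z exceeds the threshold."""
    dims = len(z_scores[0])
    mean_abs = [mean(abs(z[d]) for z in z_scores) for d in range(dims)]
    return [d for d, m in enumerate(mean_abs) if m > threshold]


def cohens_d(group_a, group_b):
    """Effect size between two groups using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def emd_1d(sample_a, sample_b):
    """Earth mover's distance between two equal-sized 1-D samples.

    For equal sizes it reduces to the mean absolute difference of the
    sorted values.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("this sketch only handles equal-sized samples")
    pairs = zip(sorted(sample_a), sorted(sample_b))
    return mean(abs(a - b) for a, b in pairs)


# Toy latent vectors with two dimensions (made-up numbers).
controls = [[-1.0, 0.0], [0.0, 1.0], [1.0, 2.0]]
patients = [[3.0, 1.5], [-2.5, 0.5]]
z = latent_z_scores(patients, controls)
flagged = significant_dims(z)
```

In this toy example only the first latent dimension is flagged, since its mean absolute Z of 2.75 exceeds 1.96 while the second dimension's 0.5 does not. Multiple-comparison control such as the FDR correction mentioned in Figure 5 is not covered by this sketch.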