Scalable Bayesian Image-on-Scalar Regression for Population-Scale Neuroimaging Data Analysis

Yuliang Xu, Timothy D. Johnson, Thomas E. Nichols, Jian Kang

TL;DR

The paper tackles scalable, uncertainty‑aware analysis of population‑scale fMRI via Bayesian Image‑on‑Scalar Regression (ISR) that accommodates subject‑specific masks. It introduces SBIOS, combining Gaussian process priors with a voxelwise inclusion indicator and a memory‑mapped, mini‑batch SGLD posterior sampler to achieve linear scaling in batch size and direct spatial inference. On the UK Biobank dataset ($n=38{,}639$, $p>10^5$ voxels, $R=110$ regions), SBIOS demonstrates $4$–$11\times$ speedups and $8$–$18\%$ power gains over Gibbs sampling with zero imputation, and identifies an amygdala subregion where emotion‑related activation declines by about $58\%$ between ages $50$ and $60$. These advances enable reliable, voxel‑level activation inferences in large‑scale neuroimaging, leveraging subject‑specific masks through imputation and providing principled uncertainty quantification via posterior inclusion probabilities.

Abstract

Bayesian Image-on-Scalar Regression (ISR) provides flexible, uncertainty-aware neuroimaging analysis. However, applying ISR to large-scale datasets such as the UK Biobank is challenging due to intensive computational demands and the need to handle subject-specific brain masks rather than a common mask. We propose a novel Bayesian ISR model that scales efficiently while accommodating these inconsistent masks. Our method leverages Gaussian process priors with salience area indicators and introduces a scalable posterior computation algorithm using stochastic gradient Langevin dynamics combined with memory mapping. This approach achieves linear scaling with subsample size and constrains memory usage to the batch size, facilitating direct spatial posterior inferences on brain activation regions. Simulation studies and analysis of UK Biobank task fMRI data (38,639 subjects; over 120,000 voxels per image) demonstrate a 4- to 11-fold speed increase and an 8-18% enhancement in statistical power compared to traditional Gibbs sampling with zero-imputation. Our analysis reveals a subregion of the amygdala where emotion-related brain activation decreases by approximately 58% between ages 50 and 60.
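To make the computational idea in the abstract concrete, the following is a minimal sketch of mini-batch stochastic gradient Langevin dynamics (SGLD) reading images from a memory-mapped file. It is not the authors' SBIOS algorithm: it assumes a single scalar covariate, a known noise variance, and an independent Gaussian prior on each voxel coefficient in place of the Gaussian process prior with inclusion indicators, and it ignores subject-specific masks and imputation. The function names, file layout, and tuning values are hypothetical choices made for illustration only.

```python
import os
import tempfile

import numpy as np


def write_simulated_images(path, x, beta_true, sigma=1.0, seed=1):
    """Write an (n, p) image matrix Y = x * beta + noise to a raw float32 file."""
    rng = np.random.default_rng(seed)
    n, p = x.size, beta_true.size
    Y = np.memmap(path, dtype=np.float32, mode="w+", shape=(n, p))
    Y[:] = np.outer(x, beta_true) + sigma * rng.standard_normal((n, p))
    Y.flush()
    del Y


def sgld_voxelwise(path, x, n, p, batch_size=50, n_iter=2000, step=1e-3,
                   sigma2=1.0, tau2=10.0, seed=0):
    """Mini-batch SGLD for y_i(s) = beta(s) * x_i + noise at p voxels.

    Assumptions (not from the paper): known noise variance sigma2 and an
    independent N(0, tau2) prior per voxel. Only `batch_size` rows of the
    image matrix are loaded into memory at each iteration.
    """
    rng = np.random.default_rng(seed)
    Y = np.memmap(path, dtype=np.float32, mode="r", shape=(n, p))
    beta = np.zeros(p)
    draws = []
    for t in range(n_iter):
        # Draw a subsample of subjects; sorted indices read the file in order.
        idx = np.sort(rng.choice(n, size=batch_size, replace=False))
        yb = np.asarray(Y[idx], dtype=np.float64)   # (batch_size, p)
        xb = x[idx]                                 # (batch_size,)
        resid = yb - np.outer(xb, beta)
        # Gradient of the log-likelihood on the batch, rescaled to the full data.
        grad_lik = (n / batch_size) * (xb[:, None] * resid).sum(axis=0) / sigma2
        grad_prior = -beta / tau2
        # Langevin step: half-step along the gradient plus N(0, step) noise.
        beta = (beta + 0.5 * step * (grad_prior + grad_lik)
                + np.sqrt(step) * rng.standard_normal(p))
        if t >= n_iter // 2:                        # discard first half as burn-in
            draws.append(beta.copy())
    del Y
    return np.stack(draws)


def demo():
    """Simulate a small data set on disk and return (posterior mean, truth)."""
    rng = np.random.default_rng(42)
    n, p = 500, 20
    x = rng.standard_normal(n)
    beta_true = np.linspace(-1.0, 1.0, p)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "images.dat")
        write_simulated_images(path, x, beta_true)
        draws = sgld_voxelwise(path, x, n, p)
    return draws.mean(axis=0), beta_true
```

Calling `demo()` returns the SGLD posterior mean alongside the simulated truth. Because each iteration touches only `batch_size` rows of the memory-mapped file, the per-iteration cost and memory footprint grow with the batch size rather than with the number of subjects, which is the scaling behavior the abstract describes.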

Paper Structure

This paper contains 36 sections, 18 equations, 20 figures, 6 tables, 2 algorithms.

Figures (20)

  • Figure 1: Incremental Differences of BIOS, SBIOS0, and SBIOSimp.
  • Figure 2: Analysis mask using an observed proportion threshold of 0.5 and an intersection mask (completely observed data). The purple area indicates 100% inclusion; the blue area indicates the mask with an observed proportion between 0.5 and 1.0.
  • Figure 3: Illustration of age-related activation patterns using a grayscale brain background image (ch2bet; Holmes et al., 1998). Images were created using MRIcron (Rorden and Brett, 2000).
  • Figure 4: Illustrations of the amygdala region.
  • Figure 5: Scatter plot of the posterior mean of $\beta(s_j)\,I(\mathrm{PIP}(s_j)\geq 0.95)$ based on SBIOS0 (x-axis) and SBIOSimp (y-axis) for six selected regions with high missingness. Blue dots indicate voxels with observed proportion $h(s_j)\in [0.5,0.7)$, red dots voxels with $h(s_j)\in [0.7,0.9)$, and black dots voxels with $h(s_j)\in [0.9,1]$.
  • ...and 15 more figures