Hybrid eTFCE-GRF: Exact Cluster-Size Retrieval with Analytical p-Values for Voxel-Based Morphometry

Don Yin, Hao Chen, Takeshi Miki, Boxing Liu, Enyu Yang

Abstract

Threshold-free cluster enhancement (TFCE) integrates cluster extent across thresholds to improve voxel-wise neuroimaging inference, but permutation testing makes it prohibitively slow for large datasets. Probabilistic TFCE (pTFCE) uses analytical Gaussian random field (GRF) p-values but discretises the threshold grid. Exact TFCE (eTFCE) eliminates discretisation via a union-find data structure but still requires permutations. We combine eTFCE's union-find for exact cluster-size retrieval with pTFCE's analytical GRF inference. The union-find builds the cluster hierarchy in one pass over sorted voxels and enables exact size queries at any threshold; GRF theory then converts these sizes to analytical p-values without permutations. Validation on synthetic phantoms (64^3, 80 subjects): FWER controlled at nominal level (0/200 null rejections, 95% CI [0.0%, 1.9%]); power matches baseline pTFCE (Dice >= 0.999); smoothness error below 1%; concordance r > 0.99. On UK Biobank (N=500) and IXI (N=563), significance maps form strict subsets of reference R pTFCE, which supports conservative error control. Implemented in pytfce (pip install pytfce): baseline completes whole-brain VBM in ~5s (75x faster than R pTFCE), hybrid in ~85s (4.6x faster) with exact cluster sizes; both >1000x faster than permutation TFCE.
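
The one-pass union-find idea in the abstract can be illustrated with a minimal sketch. This is not the pytfce implementation: the function names `build_cluster_tree` and `cluster_size_at` are hypothetical, face (6-neighbour in 3D) connectivity is an assumption, and the GRF step that converts sizes to p-values is omitted. Voxels are visited in descending order of the statistic; each newly activated voxel becomes the root of the cluster formed by merging its already-active neighbours, which records a merge tree that can be queried for the exact cluster size at any threshold.

```python
import numpy as np

def build_cluster_tree(stat):
    """One pass over voxels sorted by descending value (assumed face connectivity)."""
    shape = stat.shape
    flat = stat.ravel()
    n = flat.size
    order = np.argsort(-flat, kind="stable")
    uf_parent = np.arange(n)                 # union-find forest (path-compressed)
    tree_parent = np.full(n, -1)             # merge tree, never compressed
    size = np.zeros(n, dtype=np.int64)       # cluster size when the voxel was added
    active = np.zeros(n, dtype=bool)
    strides = [int(np.prod(shape[d + 1:])) for d in range(len(shape))]

    def find(i):
        root = i
        while uf_parent[root] != root:
            root = uf_parent[root]
        while uf_parent[i] != root:          # path compression
            nxt = uf_parent[i]
            uf_parent[i] = root
            i = nxt
        return root

    for v in order:
        v = int(v)
        active[v] = True
        size[v] = 1
        coords = np.unravel_index(v, shape)
        for d, stride in enumerate(strides):
            for step in (-1, 1):
                c = coords[d] + step
                if c < 0 or c >= shape[d]:
                    continue
                nb = v + step * stride
                if not active[nb]:
                    continue
                r = find(nb)
                if r != v:                   # merge the neighbour's cluster into v
                    uf_parent[r] = v
                    tree_parent[r] = v
                    size[v] += size[r]
    return flat, tree_parent, size

def cluster_size_at(flat, tree_parent, size, voxel, h):
    """Exact size of the cluster containing `voxel` at threshold h (0 if below h)."""
    if flat[voxel] < h:
        return 0
    node = voxel
    # Climb to the highest ancestor whose value is still at or above the threshold.
    while tree_parent[node] != -1 and flat[tree_parent[node]] >= h:
        node = tree_parent[node]
    return int(size[node])

# Toy 1D "volume": two peaks that merge once the threshold drops to 1.0.
stat = np.array([3.0, 1.0, 2.0, 2.5, 0.5])
flat, tree_parent, size = build_cluster_tree(stat)
sizes_for_voxel_3 = [cluster_size_at(flat, tree_parent, size, 3, h)
                     for h in (2.2, 2.0, 1.0, 0.5)]
```

In this sketch a query climbs the merge tree one node at a time, so a single lookup can be slow on a deep tree. It is meant to show why no threshold grid is needed, not to match the reported runtimes.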

Paper Structure (40 sections, 11 equations, 15 figures, 4 tables, 1 algorithm)

Figures (15)

  • Figure 1: Phantom specification used in the Monte Carlo validation. Three non-overlapping ellipsoidal signal regions are embedded in a $64^3$ volume with 80 simulated subjects per realisation. The signal amplitude $a$ is varied across experiments.
  • Figure 2: Demographic distributions showing the joint distribution of age and sex across acquisition sites. (a) IXI dataset ($N = 563$, three sites). (b) UK Biobank dataset ($N = 500$, four sites).
  • Figure 3: Null calibration across 200 independent null realisations ($a = 0$). The cumulative rejection count remains at zero for both the hybrid eTFCE-GRF and baseline methods at the nominal $\alpha = 0.05$ threshold. The shaded band indicates the Wilson 95% confidence interval $[0.0\%,\, 1.9\%]$ for the rejection proportion.
  • Figure 4: (a) Power curves showing Dice coefficient between detected and true signal regions as a function of signal amplitude $a$. The hybrid eTFCE-GRF curve (orange) overlaps the baseline curve (blue) at all amplitudes. Error bars denote $\pm 1$ standard deviation across 50 realisations per amplitude. (b) Spatial detection maps at $a = 0.5$ for the three variants. All three methods recover the true signal regions with Dice $= 1.0$ and zero false positives.
  • Figure 5: (a) Smoothness estimation validation across 50 null realisations. The estimated smoothness ($3.506 \pm 0.041$ voxels) closely matches the analytical value ($3.532$ voxels, dashed line), with a relative error of $-0.7\%$. (b) Wall-clock runtimes on a logarithmic scale for five inference methods across three datasets. Each point represents one timed run; analytical methods used 30 repeats on each dataset (emulated phantom, IXI, and UK Biobank), while permutation-based methods used 3 repeats on the emulated phantom only. The analytical methods are orders of magnitude faster than the permutation-based methods, which were feasible only on the $64^3$ phantom.
  • ...and 10 more figures
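
The summary statistics quoted in the captions above can be reproduced with a short sketch. It uses the standard formulas for the Wilson score interval, the Dice coefficient, and the relative error; the helper names `wilson_interval` and `dice` are hypothetical and are not taken from pytfce. The inputs are the values stated in the captions (0 rejections in 200 null realisations; estimated 3.506 against analytical 3.532 voxels).

```python
import math
import numpy as np

def wilson_interval(k, n, z=1.959963984540054):
    """Wilson score interval for k successes in n trials (default: 95%)."""
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def dice(detected, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two boolean masks."""
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = detected.sum() + truth.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(detected, truth).sum() / total

# Null calibration: 0 rejections out of 200 realisations.
lo, hi = wilson_interval(0, 200)
interval_percent = (round(100 * lo, 1), round(100 * hi, 1))

# Smoothness check: relative error of the estimate against the analytical value.
rel_error_percent = round(100 * (3.506 - 3.532) / 3.532, 1)
```

With zero observed rejections the lower bound sits at zero, so the upper bound alone carries the evidence: 200 null realisations can rule out a false-positive rate above roughly 1.9%, not establish that it is exactly zero.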