Cluster Size Matters: A Comparative Study of Notip and pARI for Post Hoc Inference in fMRI

Nils Peyrouset, Pierre Neuvial, Bertrand Thirion

TL;DR

This paper reevaluates the comparative performance of Notip and pARI, two permutation-based extensions of ARI for post hoc inference in fMRI. It shows that Notip and pARI exhibit complementary regimes: Notip provides informative drill-down bounds for small clusters and robust results across many contrasts, while pARI offers stronger bounds for large clusters but can yield non-informative results for smaller clusters. There is no universal winner; the choice hinges on cluster-size regime and pre-specified hyperparameters ($\delta$ and $K$). The findings emphasize the value of template-based JER calibration and the practical utility of drill-down inference for identifying activating subregions in neuroimaging data.

Abstract

All Resolutions Inference (ARI) is a post hoc inference method for functional Magnetic Resonance Imaging (fMRI) data analysis that provides valid lower bounds on the proportion of truly active voxels within any, possibly data-driven, cluster. As such, it addresses the paradox of spatial specificity encountered with more classical cluster-extent thresholding methods. It allows the cluster-forming threshold to be increased in order to locate the signal with greater spatial precision without overfitting, also known as the drill-down approach. Notip and pARI are two recent permutation-based extensions of ARI designed to increase statistical power by accounting for the strong dependence structure typical of fMRI data. A recent comparison between these methods based on large voxel clusters concluded that pARI outperforms Notip. We revisit this conclusion by conducting a systematic comparison of the two. Our reanalysis of the same fMRI data sets from the Neurovault database demonstrates the existence of complementary performance regimes: while pARI indeed achieves higher sensitivity for large clusters, Notip provides more informative and robust results for smaller clusters. In particular, while Notip supports informative "drill-down" exploration into subregions of activation, pARI often yields non-informative bounds in such cases, and can even underperform the baseline ARI method.

Paper Structure

This paper contains 14 sections, 4 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Clusters identified with threshold $z = 3.5$ for the "Look negative cue" vs "Look negative rating" data set: glass brain plot (top) and comparison between TDP lower bounds (bottom). For each cluster, the values in bold indicate the best result. Only clusters for which signal is detected by at least one method are reported.
  • Figure 2: Confidence curve on the TDP for the "Look negative cue vs Look negative rating" contrast: for each $k \in [m]$, we plot the TDP lower bound $\overline{\mathrm{TDP}}(S_k)$, where $S_k = \{i \in \mathcal{H}, |Z_i| \geq Z_{(k)}\}$ is the set of voxels with the $k$ largest $Z$ scores.
  • Figure 3: Lower bound on the True Discovery Proportion $\overline{\mathrm{TDP}}(S)$ as a function of cluster size $|S|$, for each cluster $S$ identified at the cluster-forming threshold $z=3$ (left panel), $z=4$ (center panel), and $z=5$ (right panel).
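
To make the confidence curve of Figure 2 concrete, here is a minimal Python sketch of a TDP lower bound evaluated on the nested sets $S_k$ of the $k$ largest $|Z|$ scores. It is an illustration under stated assumptions, not the procedure used in the paper: it uses the plain Simes template $t_j = \alpha j / m$ rather than ARI's Hommel-based refinement or the permutation-calibrated templates of Notip and pARI, it assumes the Simes inequality holds for the p-values, and it converts $Z$ scores to two-sided p-values. All function names are hypothetical.

```python
from bisect import bisect_right
from math import erfc, sqrt


def two_sided_p(z):
    """Two-sided p-value of a standard normal z-score (an assumption here)."""
    return erfc(abs(z) / sqrt(2.0))


def simes_template(m, alpha=0.05):
    """Simes thresholds t_j = alpha * j / m for j = 1..m (not a calibrated template)."""
    return [alpha * j / m for j in range(1, m + 1)]


def tdp_lower_bound(p_values, thresholds):
    """Lower bound on the TDP of a voxel set S, given its p-values.

    Bounds the number of false positives by
    V(S) = min(|S|, min_j (#{i in S : p_i > t_j} + j - 1))
    and returns 1 - V(S) / |S|.
    """
    s = len(p_values)
    if s == 0:
        return 0.0
    sorted_p = sorted(p_values)
    v = s
    for j, t in enumerate(thresholds, start=1):
        n_above = s - bisect_right(sorted_p, t)  # p-values strictly above t_j
        v = min(v, n_above + j - 1)
    return 1.0 - v / s


def confidence_curve(z_scores, alpha=0.05):
    """TDP lower bound on S_k (the k largest |z|) for k = 1..m."""
    m = len(z_scores)
    thresholds = simes_template(m, alpha)
    # Sorting p-values in increasing order ranks voxels by decreasing |z|.
    p_sorted = sorted(two_sided_p(z) for z in z_scores)
    return [tdp_lower_bound(p_sorted[:k], thresholds) for k in range(1, m + 1)]


if __name__ == "__main__":
    # Toy data: five strongly active voxels and five pure-noise voxels.
    curve = confidence_curve([6.0] * 5 + [0.0] * 5)
    print(curve)
```

On this toy input the bound stays at 1 while $S_k$ contains only the strong voxels and decreases once noise voxels enter, which is the qualitative shape a confidence curve is meant to convey. The loop recomputes each bound from scratch for readability; that is quadratic in $m$ and only suitable for small examples.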