Accelerating Benchmarking of Functional Connectivity Modeling via Structure-aware Core-set Selection

Ling Zhan; Zhen Li; Junjie Huang; Tao Jia

TL;DR

This work tackles the computational bottleneck of benchmarking hundreds of FC operators by reframing it as ranking-preserving core-set selection. It introduces Structure-aware Contrastive Learning for Core-set Selection (SCLCS), which combines three components: an adaptive multi-head Transformer encoder that learns sample-specific FC structures, the Structural Perturbation Score (SPS) that identifies structurally stable samples, and a density-balanced sampling strategy that promotes diversity. The authors prove a universal approximation property for the adaptive attention mechanism and demonstrate improved ranking preservation on the REST-meta-MDD dataset, recovering near ground-truth SPI rankings with only 10% of the data. This approach makes large-scale FC operator benchmarking practical and reproducible, potentially accelerating pre-analysis model selection in computational neuroscience.
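To make the contrast between plain top-k selection and a diversity-aware correction concrete, here is a minimal sketch. It is not the authors' sampling strategy: the stability scores, the `groups` labels (a stand-in for density regions), and the round-robin rule in `balanced_select` are all hypothetical choices made for illustration.

```python
from collections import defaultdict

def top_k(scores, k):
    """Indices of the k highest-scoring samples."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def balanced_select(scores, groups, k):
    """Pick up to k samples round-robin over groups, best score first in each.

    Assumption: `groups` is a hypothetical stand-in for density regions.
    """
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    # One queue per group, sorted by descending score; groups visited in name order.
    queues = [
        sorted(ids, key=lambda i: scores[i], reverse=True)
        for _, ids in sorted(by_group.items())
    ]
    chosen = []
    while len(chosen) < k and any(queues):
        for q in queues:
            if q and len(chosen) < k:
                chosen.append(q.pop(0))
    return chosen

# Hypothetical scores: the high-scoring samples all sit in one region.
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
groups = ["dense", "dense", "dense", "sparse", "sparse", "sparse"]

picked_top = top_k(scores, 2)                    # both from "dense"
picked_bal = balanced_select(scores, groups, 2)  # one from each region
```

With these toy inputs, top-k draws the whole budget from one region, while the balanced variant covers both.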

Abstract

Benchmarking the hundreds of functional connectivity (FC) modeling methods on large-scale fMRI datasets is critical for reproducible neuroscience. However, the combinatorial explosion of model-data pairings makes exhaustive evaluation computationally prohibitive, preventing such assessments from becoming a routine pre-analysis step. To break this bottleneck, we reframe FC benchmarking as the selection of a small, representative core-set whose sole purpose is to preserve the relative performance ranking of FC operators. We formalize this as a ranking-preserving subset selection problem and propose Structure-aware Contrastive Learning for Core-set Selection (SCLCS), a self-supervised framework to select these core-sets. SCLCS first uses an adaptive Transformer to learn each sample's unique FC structure. It then introduces a novel Structural Perturbation Score (SPS) to quantify the stability of these learned structures during training, identifying samples that represent foundational connectivity archetypes. Finally, while SCLCS identifies stable samples via a top-k ranking, we further introduce a density-balanced sampling strategy as a necessary correction to promote diversity, ensuring the final core-set is both structurally robust and distributionally representative. On the large-scale REST-meta-MDD dataset, SCLCS preserves the ground-truth model ranking with just 10% of the data, outperforming state-of-the-art (SOTA) core-set selection methods by up to 23.2% in ranking consistency (nDCG@k). To our knowledge, this is the first work to formalize core-set selection for FC operator benchmarking, thereby making large-scale operator comparisons a feasible and integral part of computational neuroscience. Code is publicly available at https://github.com/lzhan94swu/SCLCS
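Ranking consistency is reported as nDCG@k. A minimal sketch of how such a score can be computed is shown below. The relevance definition (reversed position in the full-data ranking) and the operator names are assumptions made for illustration; the exact relevance scheme used in the paper is not stated in this summary.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance values."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))

def ndcg_at_k(reference_ranking, predicted_ranking, k):
    """nDCG@k of a predicted ordering against a reference ordering.

    Assumption: an operator's relevance is its reversed position in the
    reference ranking, so the best operator has the highest relevance.
    """
    n = len(reference_ranking)
    relevance = {op: n - pos for pos, op in enumerate(reference_ranking)}
    gains = [relevance[op] for op in predicted_ranking]
    ideal = [relevance[op] for op in reference_ranking]
    return dcg_at_k(gains, k) / dcg_at_k(ideal, k)

# Hypothetical operator names; the core-set ranking swaps ranks 2 and 3.
full_data_ranking = ["op_a", "op_b", "op_c", "op_d", "op_e"]
core_set_ranking = ["op_a", "op_c", "op_b", "op_d", "op_e"]

score_same = ndcg_at_k(full_data_ranking, full_data_ranking, k=3)
score_swapped = ndcg_at_k(full_data_ranking, core_set_ranking, k=3)
```

An identical ordering scores 1, and any disagreement within the top k pulls the score below 1, with errors near the top of the ranking penalized most.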

Paper Structure (63 sections, 8 theorems, 47 equations, 8 figures, 14 tables)

Introduction
Related Work
Preliminaries
Benchmarking FC modeling.
Core-set Selection for Benchmarking FC Modeling.
Method
Attention-based FC Learning
Structural Perturbation Score (SPS)
Structure-aware Density-Balanced Sampling
Structure-aware Contrastive Learning
Experiment
Experimental Settings
Data
Baselines
Environment
...and 48 more sections

Key Result

Theorem 1

Let $\{\mathbf{A}_h\}_{h=1}^H$ be row-stochastic attention matrices. Assume disjoint structural masks: for each row $i$ there exist pairwise-disjoint sets $\{S_h^{(i)}\}_{h=1}^H$ such that $\mathbf{A}_h(i,j)=0$ for all $j\notin S_h^{(i)}$. Let $\bar{\mathbf{A}}:=\tfrac{1}{H}\sum_{h=1}^H \mathbf{A}_h$. [...] In particular, if $H\ge2$, naive averaging expands support beyond any single head's mask and inflat...
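The support-expansion part of this statement can be checked on a toy case. The sketch below uses two hypothetical heads over four columns with disjoint masks and averages them. It illustrates only the support claim, not the truncated remainder of the theorem, and the matrices are made-up values.

```python
def average_heads(heads):
    """Element-wise mean of equally shaped attention matrices (lists of rows)."""
    n_heads = len(heads)
    n_rows = len(heads[0])
    n_cols = len(heads[0][0])
    return [
        [sum(head[i][j] for head in heads) / n_heads for j in range(n_cols)]
        for i in range(n_rows)
    ]

def row_supports(matrix):
    """Per row, the set of column indices that receive non-zero weight."""
    return [{j for j, w in enumerate(row) if w > 0} for row in matrix]

# Hypothetical heads: head_1 is masked to columns {0, 1}, head_2 to {2, 3}.
# Every row sums to 1, so both matrices are row-stochastic.
head_1 = [[0.7, 0.3, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]
head_2 = [[0.0, 0.0, 0.4, 0.6], [0.0, 0.0, 0.9, 0.1]]

averaged = average_heads([head_1, head_2])
supports_avg = row_supports(averaged)
```

Each row of the average is still a probability distribution, but its support is the union of the two masks, so it is no longer confined to any single head's structure.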

Figures (8)

Figure 1: Overview of the SCLCS framework for ranking-preserving core-set selection. Contrasting with selection for single-model classification (top left), our task is to preserve the performance ranking of SPIs (top right). Our method (bottom) achieves this using a Transformer to learn structures, our novel SPS metric to ensure stability, and a density-aware strategy to promote diversity.
Figure 2: Sample coverage balance on subjects and MDD/HC of baselines.
Figure 3: The evolution of the learned attention map $\textbf{A}_{(\textbf{X})}^e$ across training epochs.
Figure A1: Time consumption of different SPIs on a single sample.
Figure A2: Rank comparison on brain fingerprinting using rank/density-based sampling strategies.
...and 3 more figures

Theorems & Definitions (15)

Theorem 1: Interference of Averaged Attention
Theorem 2: Universal Approximation of Continuous Stochastic SPIs
Proposition 1: Mixture-driven perturbation magnitude
Theorem 3: Persistent bias of top-$k$ selection
Theorem : Interference of Averaged Attention, full version
proof
proof
proof
Lemma 1: Consistency of SPS
proof
...and 5 more
