Across-subject ensemble-learning alleviates the need for large samples for fMRI decoding

Himanshu Aggarwal, Liza Al-Shikhley, Bertrand Thirion

TL;DR

Decoding cognitive states from fMRI is hindered by a high feature-to-sample ratio and poor cross-subject generalization. The authors propose an across-subject ensemble-learning (stacking) approach that pre-trains per-subject classifiers and trains a final meta-learner on their predictions to decode a new subject, evaluated across five datasets with both voxel-space and DiFuMo features and multiple classifiers. The ensemble method yields up to ~20% accuracy gains over conventional decoding, especially when per-subject data are scarce, with full-voxel pre-training and an MLP final classifier providing robust performance. These results demonstrate that cross-subject pre-training can significantly reduce the per-subject data requirements, supporting more efficient fMRI decoding for real-time BCI and cognitive research, while suggesting avenues for deeper pre-training and functional alignment to further enhance cross-subject transfer.

Abstract

Decoding cognitive states from functional magnetic resonance imaging is central to understanding the functional organization of the brain. Within-subject decoding avoids between-subject correspondence problems but requires large sample sizes to make accurate predictions; obtaining such large sample sizes is both challenging and expensive. Here, we investigate an ensemble approach to decoding that combines the classifiers trained on data from other subjects to decode cognitive states in a new subject. We compare it with the conventional decoding approach on five different datasets and cognitive tasks. We find that it outperforms the conventional approach by up to 20% in accuracy, especially for datasets with limited per-subject data. The ensemble approach is particularly advantageous when the classifier is trained in voxel space. Furthermore, a Multi-layer Perceptron turns out to be a good default choice as an ensemble method. These results show that the pre-training strategy reduces the need for large per-subject data.
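The stacking scheme described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the data are synthetic, all simulated subjects share one class pattern (real subjects differ), a closed-form ridge classifier stands in for both the pre-trained classifiers and the final classifier (the paper recommends a Multi-layer Perceptron for the latter), and helper names such as `fit_ridge_classifier` and `simulate_subject` are hypothetical.

```python
import numpy as np

def fit_ridge_classifier(X, y, n_classes, alpha=10.0):
    """Least-squares fit onto one-hot labels with an L2 penalty (bias included)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    Y = np.eye(n_classes)[y]                       # one-hot targets
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)

def decision_scores(W, X):
    """Per-class scores of a fitted ridge classifier."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W

def simulate_subject(rng, pattern, n_per_class, noise=3.0):
    """Synthetic subject: a class pattern plus Gaussian noise (assumption)."""
    n_classes, n_features = pattern.shape
    y = np.repeat(np.arange(n_classes), n_per_class)
    X = pattern[y] + noise * rng.standard_normal((y.size, n_features))
    return X, y

rng = np.random.default_rng(0)
n_classes, n_features, n_subjects = 2, 200, 6
pattern = rng.standard_normal((n_classes, n_features))

# Step 1: pre-train one classifier on each of the N-1 other subjects.
pretrained = []
for _ in range(n_subjects - 1):
    X_s, y_s = simulate_subject(rng, pattern, n_per_class=50)
    pretrained.append(fit_ridge_classifier(X_s, y_s, n_classes))

# The new (N-th) subject has only a few training samples per class.
X_train, y_train = simulate_subject(rng, pattern, n_per_class=5)
X_test, y_test = simulate_subject(rng, pattern, n_per_class=50)

def stack_features(X):
    """Concatenate the scores of every pre-trained classifier."""
    return np.hstack([decision_scores(W, X) for W in pretrained])

# Step 2: train the final classifier on the stacked predictions.
W_final = fit_ridge_classifier(stack_features(X_train), y_train, n_classes)
y_pred_ensemble = decision_scores(W_final, stack_features(X_test)).argmax(axis=1)

# Conventional baseline: train directly on the new subject's features.
W_conv = fit_ridge_classifier(X_train, y_train, n_classes)
y_pred_conv = decision_scores(W_conv, X_test).argmax(axis=1)

acc_ensemble = float((y_pred_ensemble == y_test).mean())
acc_conventional = float((y_pred_conv == y_test).mean())
print(f"ensemble: {acc_ensemble:.2f}, conventional: {acc_conventional:.2f}")
```

The point of the design is that the final classifier sees only `(n_subjects - 1) * n_classes` inputs instead of `n_features`, so it can be fit from very few samples of the new subject. The numbers this toy example prints say nothing about real fMRI data.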

Paper Structure (14 sections, 2 equations, 4 figures, 1 table)

Figures (4)

  • Figure 1: Ensemble-learning (by stacking): Ensemble-learning (by stacking) involves (top) pre-training a separate classifier for each of $N-1$ subjects and then (bottom) training a final classifier to learn the mapping between the predictions of each pre-trained classifier and the true labels for the $N^{\text{th}}$ subject.
  • Figure 2: Average decoding accuracy: Each plot represents a different dataset (along columns). The average decoding accuracy is plotted along the x-axis. The averages are across all training sizes, subjects and 20 cross-validation splits. The error bars represent a 95% confidence interval of the bootstrap distribution. The horizontal line represents the chance level of accuracy.
  • Figure 3: Gain in decoding accuracy when varying the number of training samples per class: Each plot represents a different dataset (along columns). The y-axis shows the average percent gain in decoding accuracy (accuracy of ensemble - accuracy of conventional) across all subjects and 20 cross-validation splits. On the x-axis, training size is reported as the number of samples per class in each cross-validation split. The confidence intervals are 95% confidence intervals of the bootstrap distribution. The horizontal line represents no average gain in accuracy, and the vertical line marks 10 samples per class.
  • Figure 4: Gain in decoding accuracy over a varying number of subjects in the ensemble: Each plot represents a different dataset (along columns). The x-axis represents the number of subjects used in the ensemble method. The y-axis represents the average percent gain in decoding accuracy (accuracy of ensemble - accuracy of conventional) across all training sizes and 5 cross-validation splits. The confidence intervals are 95% confidence intervals of the bootstrap distribution. The horizontal line represents no average gain in accuracy, and the vertical line marks 10 subjects in the ensemble.
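
The gain and bootstrap confidence interval reported in Figures 3 and 4 could be computed along the following lines. This is a sketch under stated assumptions: the accuracies are made-up placeholder values, the gain is taken as a difference in percentage points, and a percentile bootstrap over cross-validation splits is assumed, since the captions do not say which bootstrap variant or resampling unit was used. The helper `bootstrap_ci` is hypothetical.

```python
import numpy as np

def bootstrap_ci(values, n_boot=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample with replacement: one row of indices per bootstrap replicate.
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    boot_means = values[idx].mean(axis=1)
    tail = 100.0 * (1.0 - level) / 2.0
    lo, hi = np.percentile(boot_means, [tail, 100.0 - tail])
    return float(values.mean()), float(lo), float(hi)

# Placeholder per-split accuracies (illustrative values, not from the paper).
acc_ensemble = np.array([0.72, 0.68, 0.75, 0.70, 0.66, 0.74, 0.71, 0.69])
acc_conventional = np.array([0.60, 0.62, 0.64, 0.58, 0.61, 0.63, 0.59, 0.65])

# Gain as plotted on the y-axis: ensemble minus conventional, in percentage points.
gain = 100.0 * (acc_ensemble - acc_conventional)

mean_gain, ci_low, ci_high = bootstrap_ci(gain)
print(f"mean gain: {mean_gain:.2f} points, 95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

An interval that lies entirely above zero corresponds to a curve sitting above the horizontal no-gain line in the figures.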