Scaling up ridge regression for brain encoding in a massive individual fMRI dataset

Sana Ahmadi; Pierre Bellec; Tristan Glatard

TL;DR

Batch parallelization using Dask emerges as a scalable approach for brain encoding with ridge regression on high-performance computing systems using scikit-learn and large fMRI datasets.

Abstract

Brain encoding with neuroimaging data is an established analysis aimed at predicting human brain activity directly from complex stimulus features such as movie frames. Typically, these features are the latent space representation from an artificial neural network, and the stimuli are image, audio, or text inputs. Ridge regression is a popular prediction model for brain encoding due to its good out-of-sample generalization performance. However, training a ridge regression model can be highly time-consuming when dealing with large-scale deep functional magnetic resonance imaging (fMRI) datasets that include many space-time samples of brain activity. This paper evaluates different parallelization techniques to reduce the training time of brain encoding with ridge regression on the CNeuroMod Friends dataset, one of the largest deep fMRI resources currently available. With multi-threading, our results show that the Intel Math Kernel Library (MKL) significantly outperforms the OpenBLAS library, being 1.9 times faster using 32 threads on a single machine. We then evaluated the Dask multi-CPU implementation of ridge regression readily available in scikit-learn (MultiOutput), and we proposed a new "batch" version of Dask parallelization, motivated by a time complexity analysis. In line with our theoretical analysis, MultiOutput parallelization was found to be impractical, i.e., slower than multi-threading on a single machine. In contrast, the Batch-MultiOutput regression scaled well across compute nodes and threads, providing speed-ups of up to 33 times with 8 compute nodes and 32 threads compared to a single-threaded scikit-learn execution. Batch parallelization using Dask thus emerges as a scalable approach for brain encoding with ridge regression on high-performance computing systems using scikit-learn and large fMRI datasets.
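The contrast between fitting one model per brain target and fitting one model per batch of targets can be illustrated with a small serial sketch. This is not the authors' Dask implementation: the helper names `ridge_fit` and `batched_ridge_fit`, the array sizes, and the regularization value are all assumptions made for illustration, and the batches are processed in a plain loop rather than on distributed workers. The sketch only shows that splitting the target matrix into column batches yields the same weights while requiring one factorization of the feature matrix per batch instead of one per target.

```python
import numpy as np

def ridge_fit(X, Y, lam):
    """Ridge weights from the thin SVD of X (hypothetical helper)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    shrink = s / (s**2 + lam)            # diagonal of (S^2 + lam I)^-1 S
    return Vt.T @ (shrink[:, None] * (U.T @ Y))

def batched_ridge_fit(X, Y, lam, n_batches):
    """Fit ridge on column batches of Y; return weights and factorization count."""
    batches = np.array_split(np.arange(Y.shape[1]), n_batches)
    weights = [ridge_fit(X, Y[:, idx], lam) for idx in batches]
    return np.hstack(weights), len(batches)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 30))       # time samples x stimulus features
Y = rng.standard_normal((200, 12))       # time samples x brain targets

B_full = ridge_fit(X, Y, lam=10.0)
# One task per target: as many factorizations of X as there are targets.
B_per_target, n_svd_per_target = batched_ridge_fit(X, Y, 10.0, n_batches=Y.shape[1])
# One task per batch: the factorization cost is paid only once per batch.
B_batched, n_svd_batched = batched_ridge_fit(X, Y, 10.0, n_batches=4)
```

In a distributed setting each batch would be sent to a separate worker, so the number of batches would typically be chosen to match the number of available workers rather than the number of targets.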

Paper Structure (34 sections, 14 equations, 10 figures, 4 tables, 1 algorithm)

Introduction
Materials and Methods
fMRI dataset
Friends TV show stimuli
Participants
Magnetic resonance imaging
Preprocessing
Multiresolution time series extraction
Brain encoding
VGG16 artificial vision network
Extracting VGG16 features of dynamic visual stimuli
Ridge regression
Brain encoding performance and hyper-parameter optimization
Ridge regression implementations
Scikit-learn efficient ridge implementation
...and 19 more sections

Figures (10)

Figure 1: The two main steps of brain encoding: Extracting features from movie frames using VGG16 pretrained model and predicting brain response using ridge regression.
Figure 2: Multithreading and distributed parallelism in scikit-learn's ridge regression
Figure 3: Matrix computations in multi-threading ridgeCV, MOR and B-MOR model fitting. Assuming $X \in \mathbb{R}^{n \times p}$, $Y \in \mathbb{R}^{n \times t}$ and $X = USV^T$, the weight matrix $B \in \mathbb{R}^{p \times t}$ equals $B = V (S^2 + \lambda I_{p})^{-1} S U^{T} Y$.
Figure 4: Brain encoding results, with performance based on the Pearson correlation coefficient (r) between real and predicted time series in the Friends dataset (N=6 subjects).
Figure 5: Brain encoding predictions for a single individual (sub-01) in two cases. Panel a: corresponding pairs of {fMRI time series and stimuli} were presented to the ridge regression models. Panel b: random permutations of fMRI time series and stimuli data were presented to the ridge regression model.
...and 5 more figures
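
The closed-form ridge solution quoted in the Figure 3 caption, in which the weights are obtained from the singular value decomposition of the feature matrix, can be checked numerically against the normal equations. The sketch below is a minimal illustration under assumed toy dimensions (more samples than features) and assumed regularization values; the function names are hypothetical and it is not taken from the paper's code. It also shows why the SVD form is convenient for hyper-parameter search: the decomposition and the product of the left singular vectors with the targets are computed once and reused for every value of the regularization parameter.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p, t = 120, 20, 5                      # samples, features, targets (toy sizes)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, t))

# Thin SVD X = U S V^T, computed once.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
UtY = U.T @ Y                             # reused for every lambda

def weights_from_svd(lam):
    """B = V (S^2 + lam I)^-1 S U^T Y, using the cached SVD."""
    return Vt.T @ ((s / (s**2 + lam))[:, None] * UtY)

def weights_from_normal_equations(lam):
    """B = (X^T X + lam I)^-1 X^T Y, solved directly for comparison."""
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)

lambdas = [0.1, 1.0, 100.0]               # assumed grid of regularization values
max_abs_diff = max(
    np.abs(weights_from_svd(lam) - weights_from_normal_equations(lam)).max()
    for lam in lambdas
)
```

The two routes agree up to floating-point error, while only the SVD route avoids solving a new linear system for each regularization value.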
