Estimating Brain Activity with High Spatial and Temporal Resolution using a Naturalistic MEG-fMRI Encoding Model

Beige Jerry Jin, Leila Wehbe

TL;DR

The paper presents a transformer-based encoding model that jointly predicts MEG and fMRI from naturalistic speech while inferring latent cortical source activity with millisecond precision and millimeter spatial resolution. By embedding anatomical forward models (lead fields) and source morphing, the model maps stimulus features to a shared fsaverage-derived source space and then to modality-specific predictions, enabling cross-subject generalization and validation with ECoG data. Simulation and ECoG results demonstrate accurate recovery of time courses and spatial patterns, and zero-shot predictions show substantial electrode-level correspondence in unseen data. This integrative approach advances non-invasive brain mapping by combining high temporal and spatial fidelity in naturalistic paradigms, with potential for richer language and cognition studies.

Abstract

Current non-invasive neuroimaging techniques trade off between spatial resolution and temporal resolution. While magnetoencephalography (MEG) can capture rapid neural dynamics and functional magnetic resonance imaging (fMRI) can spatially localize brain activity, a unified picture that preserves both high resolutions remains an unsolved challenge with existing source localization or MEG-fMRI fusion methods, especially for single-trial naturalistic data. We collected whole-head MEG while subjects listened passively to more than seven hours of narrative stories, using the same stimuli as in an open fMRI dataset (LeBel et al., 2023). We developed a transformer-based encoding model that combines the MEG and fMRI from these two naturalistic speech comprehension experiments to estimate latent cortical source responses with high spatiotemporal resolution. Our model is trained to predict MEG and fMRI from multiple subjects simultaneously, with a latent layer that represents our estimates of reconstructed cortical sources. Our model predicts MEG better than the common standard of single-modality encoding models, and it also yields source estimates with higher spatial and temporal fidelity than classic minimum-norm solutions in simulation experiments. We validated the estimated latent sources by showing their strong generalizability across unseen subjects and modalities. Estimated activity in our source space predicts electrocorticography (ECoG) better than an ECoG-trained encoding model in an entirely new dataset. By integrating the power of large naturalistic experiments, MEG, fMRI, and encoding models, we propose a practical route towards millisecond-and-millimeter brain mapping.
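
The data flow summarized above (shared latent sources feeding modality-specific predictions) can be illustrated with a small NumPy sketch. It is not the authors' implementation: all array sizes, the random stand-ins for the morphing and lead-field matrices, the rectification used as an "envelope", and the fixed gamma-shaped HRF are assumptions made for illustration. The paper describes a learnable HRF kernel and sources produced by a transformer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumptions, not values from the paper).
n_fsavg, n_subj_src, n_sensors = 50, 40, 12
n_times, sfreq, tr = 400, 50.0, 2.0  # samples, Hz, seconds per fMRI volume
samples_per_tr = int(sfreq * tr)

# Latent source estimates in the shared template space (sources x time).
# In the model these come from the network; here they are random.
src_fsavg = rng.standard_normal((n_fsavg, n_times))

# Pre-computed, non-learnable anatomy (random stand-ins here).
morph = rng.standard_normal((n_subj_src, n_fsavg))        # template -> subject
leadfield = rng.standard_normal((n_sensors, n_subj_src))  # sources -> sensors


def meg_head(src_fsavg, morph, leadfield):
    """Morph template sources to one subject, then apply the lead field."""
    src_subj = morph @ src_fsavg
    return leadfield @ src_subj


def fmri_head(src_subj, hrf, samples_per_tr):
    """Envelope -> downsample to the TR grid -> causal HRF convolution."""
    envelope = np.abs(src_subj)  # assumption: rectification as the envelope
    n_src = envelope.shape[0]
    n_tr = envelope.shape[1] // samples_per_tr
    trimmed = envelope[:, : n_tr * samples_per_tr]
    env_ds = trimmed.reshape(n_src, n_tr, samples_per_tr).mean(axis=2)
    # Full convolution truncated to n_tr samples keeps the filter causal.
    return np.stack([np.convolve(row, hrf)[:n_tr] for row in env_ds])


# Toy gamma-shaped HRF sampled at the TR (fixed here, learnable in the paper).
t = np.arange(0.0, 16.0, tr)
hrf = t**5 * np.exp(-t)
hrf /= hrf.sum()

meg_pred = meg_head(src_fsavg, morph, leadfield)
bold_pred = fmri_head(morph @ src_fsavg, hrf, samples_per_tr)
print(meg_pred.shape, bold_pred.shape)
```

Because both heads read from the same `src_fsavg` array, a loss on either modality constrains the same latent sources, which is the mechanism the joint model relies on.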

Paper Structure

This paper contains 27 sections, 1 equation, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Integration of MEG and fMRI. Our work integrates the millisecond-level temporal precision of MEG with the millimeter-scale spatial specificity of fMRI to reconstruct cortical source activity at a high spatiotemporal resolution in naturalistic experiments.
  • Figure 2: Architecture of the MEG-fMRI encoding model. Feature streams enter the network through the input layer and traverse four transformer layers before being projected into the "fsaverage" source space by the source layer. The source estimates in the "fsaverage" source space are then transformed into subject-specific source estimates by the source morphing matrix. The MEG head predicts sensor signals by multiplying the source estimates with the lead-field matrix. The fMRI head predicts BOLD responses by convolving the downsampled envelope of the source estimates with a learnable hemodynamic response function (HRF) kernel. The MEG and fMRI of multiple subjects (e.g., S1, S2, ...) are predicted simultaneously. Under the joint constraints of MEG and fMRI from multiple subjects, our model recovers the source estimates with high spatiotemporal resolution. Dashed arrows indicate steps that are pre-computed and not learnable.
  • Figure 3: Predictive performance on MEG and fMRI for two example subjects. Our model is comparable to single-subject, single-modality ridge models which serve as a ceiling. Top: performance on the MEG of S1. Both magnetometer and gradiometer sensors above the temporal lobe are predicted. Bottom: performance on the source-level BOLD signals of S6, shown on the inflated surface. Large parts of bilateral temporal and frontal areas are predicted.
  • Figure 4: Results of simulation experiments. We report the mean Pearson $r$ computed over time within each source (left panel, temporal correlation) and over sources at each time point (right panel, spatial correlation) between the source estimates and the ground truth. Our model outperforms fMNE in both aspects under all noise levels.
  • Figure 5: Predictive performance on a new ECoG dataset. Top left: Performance of our model's zero-shot prediction on a binary classification task. Top right: Mean Pearson $r$ of the top 25% of electrodes for our model and the linear encoding model under different amounts of training data. Notably, a training proportion of 0% corresponds to zero-shot prediction. Middle: Correlation map of our model's zero-shot prediction and electrode-wise correlation difference with the linear encoding model. Bottom: Correlation map of our model's trained prediction with 100% training data and electrode-wise correlation difference with the linear encoding model.
  • ...and 3 more figures
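
The two scores named in the Figure 4 caption can be computed as follows. This is a minimal sketch assuming the estimates and ground truth are stored as (sources x time) arrays; the function names and the toy data are hypothetical, not taken from the paper.

```python
import numpy as np


def pearson_rows(a, b):
    """Pearson r between corresponding rows of two 2-D arrays."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return (a * b).sum(axis=1) / denom


def temporal_correlation(est, truth):
    """Mean over sources of r computed across time within each source."""
    return pearson_rows(est, truth).mean()


def spatial_correlation(est, truth):
    """Mean over time points of r computed across sources at each time."""
    return pearson_rows(est.T, truth.T).mean()


# Toy data: ground truth plus additive noise (illustrative only).
rng = np.random.default_rng(0)
truth = rng.standard_normal((30, 200))  # sources x time
est = truth + 0.5 * rng.standard_normal(truth.shape)

r_time = temporal_correlation(est, truth)
r_space = spatial_correlation(est, truth)
print(r_time, r_space)
```

The two scores answer different questions: the temporal score asks whether each source's time course is recovered, while the spatial score asks whether the pattern across the cortex is right at each instant.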