GPT2MEG: Quantizing MEG for Autoregressive Generation

Richard Csaky, Mats W. J. van Es, Oiwi Parker Jones, Mark Woolrich

TL;DR

GPT2MEG more faithfully reproduces temporal, spectral, and task-evoked statistics of real MEG than WaveNet variants and linear autoregressive baselines, and scales to multiple subjects via subject embeddings.

Abstract

Foundation models trained with self-supervised objectives are increasingly applied to brain recordings, but autoregressive generation of realistic multichannel neural time series remains comparatively underexplored, particularly for Magnetoencephalography (MEG). We study (i) modified multichannel WaveNet variants and (ii) a GPT-2-style Transformer, autoregressively trained by next-step prediction on unlabelled MEG. For the Transformer, we propose a simple quantization/tokenization and embedding scheme (channel, subject, and task-condition embeddings) that repurposes a language-model architecture for continuous, high-rate multichannel time series and enables conditional simulation of task-evoked activity. Across forecasting, long-horizon generation, and downstream decoding, GPT2MEG more faithfully reproduces temporal, spectral, and task-evoked statistics of real MEG than WaveNet variants and linear autoregressive baselines, and scales to multiple subjects via subject embeddings. Code available at https://github.com/ricsinaruto/MEG-transfer-decoding.
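To make the quantization/tokenization idea concrete, here is a minimal sketch of turning one continuous channel into integer tokens and back. It assumes uniform bins over a per-channel peak-normalised range and 256 bins (the bin count implied by the $1/256$ chance level quoted for Figure 1). The abstract does not specify the paper's exact scheme, so the function names `quantize` and `dequantize`, the scaling, and the toy sine signal are illustrative assumptions, not the authors' implementation.

```python
import math

NUM_BINS = 256  # assumed; consistent with the 1/256 chance level in Figure 1


def quantize(samples, num_bins=NUM_BINS):
    """Map real-valued samples to integer tokens in [0, num_bins - 1].

    Assumption: uniform bins over [-peak, peak], where peak is the largest
    absolute value in this channel. The paper may use a different scheme.
    """
    peak = max(abs(x) for x in samples) or 1.0  # avoid division by zero
    tokens = []
    for x in samples:
        scaled = (x / peak + 1.0) / 2.0  # rescale to [0, 1]
        # The top edge (scaled == 1.0) would land in bin num_bins; clamp it.
        tokens.append(min(int(scaled * num_bins), num_bins - 1))
    return tokens, peak


def dequantize(tokens, peak, num_bins=NUM_BINS):
    """Map tokens back to the centre value of their bin."""
    return [((t + 0.5) / num_bins * 2.0 - 1.0) * peak for t in tokens]


# Toy stand-in for one MEG channel: a 10 Hz sine sampled at 250 Hz for 1 s.
channel = [math.sin(2 * math.pi * 10 * t / 250) for t in range(250)]
tokens, peak = quantize(channel)
recon = dequantize(tokens, peak)
max_err = max(abs(a - b) for a, b in zip(channel, recon))
```

With uniform bins, the reconstruction error is bounded by half a bin width, `peak / NUM_BINS`. Once every channel is a sequence over a 256-symbol vocabulary, next-step prediction becomes a classification problem of the kind a language-model architecture is built for.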

Paper Structure (32 sections, 10 equations, 21 figures, 1 table)

Figures (21)

  • Figure 1: Next-timestep prediction performance across the different forecasting models. Accuracy values are computed on left-out test data and reported on a 0-1 scale. Chance level is $1/256$; however, always predicting the majority class (the most frequent quantised value) scores somewhat higher, since the distribution over bins is not uniform. ChannelGPT refers to GPT2MEG.
  • Figure 2: PSD comparison between real MEG (a) and long-horizon generations from different models (b--d) for a representative subject. Each line is a sensor/channel. Additional WFCM sampling results, which generally produce less realistic spectra, are in Appendix \ref{ssec:appendix_psd_wfcm}.
  • Figure 3: Distributions of dynamics summary statistics across the 12 states from an HMM inferred on real MEG multi-channel data from a single sample subject [left column], and from multi-channel data generated from 4 different forecasting models trained on the single sample subject. Four different summary statistics are shown describing the state dynamics (rows). ChannelGPT refers to GPT2MEG.
  • Figure 4: Comparison of evoked timecourses in 2 channels across the task-conditioned models (conditioned via the task-condition embedding), for real MEG data [blue line] from a single sample subject and for data generated by the different forecasting models trained on that subject. Stimulus onset is at 0 s and offset is at 500 ms. The peak occurring after 500 ms indicates a visual response to the offset of the stimulus (removal of the image). Shading indicates the 95% confidence interval of the trial average. ChannelGPT refers to GPT2MEG.
  • Figure 5: Comparison of evoked responses in a visual channel (MEG2332) for a single sample subject, using real MEG data (blue), data generated by the GPT2MEG model trained on that subject alone (orange), and data generated for that subject (using the appropriate subject embedding) by the GPT2MEG-group model trained on all subjects (green). Stimulus onset is at 0 s and offset is at 500 ms. Shading indicates the 95% confidence interval of the trial mean. ChannelGPT refers to GPT2MEG.
  • ...and 16 more figures
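
The Figure 1 caption distinguishes the uniform chance level ($1/256$) from the accuracy obtained by always predicting the majority class. A minimal sketch of the two baselines follows. The token sequences are made-up toy values, not data from the paper, and the function names are illustrative assumptions.

```python
from collections import Counter


def chance_accuracy(num_bins):
    """Expected accuracy of guessing uniformly at random over the bins."""
    return 1.0 / num_bins


def majority_class_accuracy(train_tokens, test_tokens):
    """Accuracy of always predicting the most frequent training token."""
    majority = Counter(train_tokens).most_common(1)[0][0]
    hits = sum(1 for t in test_tokens if t == majority)
    return hits / len(test_tokens)


# Toy token streams clustered around the middle bins, mimicking a
# non-uniform distribution over quantised values (hypothetical numbers).
train = [128, 128, 127, 128, 129, 128, 127, 128]
test = [128, 127, 128, 129, 128, 130, 128, 128, 126, 128]

chance = chance_accuracy(256)
majority = majority_class_accuracy(train, test)
```

Because quantised amplitudes concentrate in a few bins, the majority-class baseline exceeds $1/256$, which is why it is the more informative reference point when reading next-step accuracies.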