Dual Stream Graph Transformer Fusion Networks for Enhanced Brain Decoding
Lucas Goene, Siamak Mehrkanoon
TL;DR
The paper tackles MEG brain decoding by integrating spatial and temporal patterns through a Dual Stream Graph Transformer Fusion (DS-GTF) architecture. It uses a Graph Attention Module to model spatial relations across MEG channels and a Transformer Encoder to model temporal dynamics, and it is trained end to end, fusing both representations before the output layer. A key contribution is the exploration of three adjacency initialization schemes based on an RBF kernel, including a Top-K strategy that yields notable gains (particularly with $K=3$), together with a robustness analysis across edge counts. The approach demonstrates improved cross-subject performance and reduced variability on a subset of Human Connectome Project MEG data, highlighting practical potential for robust, domain-agnostic MEG decoding.
Abstract
This paper presents the novel Dual Stream Graph-Transformer Fusion (DS-GTF) architecture, designed specifically for classifying task-based magnetoencephalography (MEG) data. In the spatial stream, inputs are first represented as graphs, which are then passed through graph attention networks (GAT) to extract spatial patterns. Two methods, TopK and Thresholded Adjacency, are introduced for initializing the adjacency matrix used in the GAT. In the temporal stream, the Transformer Encoder receives concatenated windowed input MEG data and learns new temporal representations. The learned temporal and spatial representations from both streams are fused before reaching the output layer. Experimental results demonstrate an enhancement in classification performance and a reduction in standard deviation across multiple test subjects compared to other examined models.
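To make the two adjacency initialization ideas concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each MEG channel is described by a feature vector (here, hypothetical 3-D sensor coordinates), that affinities come from a Gaussian RBF kernel as mentioned in the TL;DR, and that the function names, the bandwidth `sigma`, and the threshold `tau` are illustrative choices that do not come from the paper.

```python
import numpy as np


def rbf_affinity(features, sigma=1.0):
    """Pairwise Gaussian RBF affinity between per-channel feature vectors.

    `features` has shape (n_channels, n_dims). Using sensor coordinates as
    features is an assumption made only for this sketch.
    """
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def top_k_adjacency(affinity, k=3):
    """Binary adjacency keeping each node's k strongest neighbours.

    Self-connections are excluded. The result is directed: row i marks the
    k nodes most similar to node i, so it is not necessarily symmetric.
    """
    n = affinity.shape[0]
    scores = affinity.copy()
    np.fill_diagonal(scores, -np.inf)  # a node is never its own neighbour
    neighbours = np.argsort(-scores, axis=1)[:, :k]
    adjacency = np.zeros((n, n), dtype=affinity.dtype)
    rows = np.repeat(np.arange(n), k)
    adjacency[rows, neighbours.ravel()] = 1.0
    return adjacency


def thresholded_adjacency(affinity, tau=0.5):
    """Binary adjacency keeping every edge whose affinity is at least tau."""
    adjacency = (affinity >= tau).astype(affinity.dtype)
    np.fill_diagonal(adjacency, 0.0)  # drop self-loops
    return adjacency


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.normal(size=(8, 3))  # 8 hypothetical channels in 3-D
    aff = rbf_affinity(coords, sigma=1.0)
    print(top_k_adjacency(aff, k=3).sum(axis=1))  # edges kept per node
    print(thresholded_adjacency(aff, tau=0.5).sum(axis=1))
```

In this sketch the Top-K variant fixes the number of outgoing edges per node, whereas the thresholded variant lets the edge count vary with how tightly the channels cluster. Either matrix could then serve as the initial graph structure passed to a graph attention layer.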
