ESTformer: Transformer utilising spatiotemporal dependencies for electroencephalogram super-resolution
Dongdong Li, Zhongliang Zeng, Zhe Wang, Hai Yang
TL;DR
ESTformer addresses the challenge of reconstructing high-resolution EEG data from lightweight, low-channel devices with a transformer-based framework that models spatiotemporal dependencies. It introduces a fixed-mask strategy and decomposes the modelling into a Spatial Interpolation Module (SIM) and a Temporal Reconstruction Module (TRM), both built from space-wise and time-wise self-attention with 3D spatial and 1D temporal positional encodings. The approach achieves state-of-the-art super-resolution (SR) performance on two EEG datasets and improves downstream tasks such as person identification and emotion recognition, demonstrating its potential to enable practical, real-time EEG applications with fewer channels. The work also reports significant reductions in computation compared to 2D-CNN baselines and offers a pathway for deploying lightweight EEG systems without sacrificing fidelity.
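To make the distinction between space-wise and time-wise self-attention concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single attention head with identity query/key/value projections, no positional encoding, and an EEG segment stored as a `(channels, time)` array; the function names are hypothetical. The only point it illustrates is which axis supplies the tokens: channels for space-wise attention, time steps for time-wise attention.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # Single-head scaled dot-product self-attention.
    # Assumption: identity Q/K/V projections (a real model learns these).
    # tokens: (n_tokens, dim) -> (n_tokens, dim)
    dim = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)
    return weights @ tokens

def space_wise_attention(eeg):
    # Tokens are channels; each token's features are its time samples.
    return self_attention(eeg)

def time_wise_attention(eeg):
    # Tokens are time steps; each token's features are the channel values.
    return self_attention(eeg.T).T

rng = np.random.default_rng(0)
eeg = rng.standard_normal((8, 32))      # 8 channels, 32 time samples
out_space = space_wise_attention(eeg)   # mixes information across channels
out_time = time_wise_attention(eeg)     # mixes information across time steps
```

Both calls preserve the `(channels, time)` shape, so in principle such blocks can be stacked in either order; how ESTformer actually composes them inside the SIM and TRM is described in the paper itself.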
Abstract
Lightweight acquisition devices are attracting significant attention as a route to practical applications of electroencephalography (EEG). However, EEG channel selection methods are commonly data-sensitive and cannot establish a unified, sound paradigm for EEG acquisition devices. Taking the reverse view, EEG applications can be formulated as an EEG super-resolution (SR) problem, but existing formulations suffer from high computation costs, extra interpolation bias, and limited insight into spatiotemporal dependency modelling. To this end, we propose ESTformer, an EEG SR framework that utilises spatiotemporal dependencies based on the transformer. ESTformer applies positional encoding methods and a multihead self-attention mechanism to the space and time dimensions, which allows it to learn spatial structural correlations and temporal functional variations. With the fixed mask strategy, ESTformer uses a mask token to upsample low-resolution (LR) EEG data, avoiding the disturbance introduced by mathematical interpolation methods. On this basis, we design various transformer blocks to construct a spatial interpolation module (SIM) and a temporal reconstruction module (TRM). Finally, ESTformer cascades the SIM and TRM to capture and model the spatiotemporal dependencies for EEG SR with fidelity. Extensive experimental results on two EEG datasets show the effectiveness of ESTformer against previous state-of-the-art methods, demonstrating the versatility of the transformer for EEG SR tasks. The superiority of the SR data was verified in EEG-based person identification and emotion recognition tasks, achieving a 2% to 38% improvement compared with the LR data at different sampling scales.
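The mask-token upsampling described above can be sketched as follows. This is an illustrative NumPy example under stated assumptions, not the paper's code: the mask token is shown as a fixed vector in signal space (in a trained model it would typically be a learnable embedding), and the function name `upsample_with_mask_token`, its parameters, and the toy channel layout are all hypothetical. The idea is that the retained low-resolution channels sit at fixed positions in the high-resolution layout, and every missing channel is filled with the same placeholder token rather than with a mathematically interpolated signal.

```python
import numpy as np

def upsample_with_mask_token(lr_eeg, kept_idx, n_hr_channels, mask_token):
    # lr_eeg: (n_lr_channels, n_time) low-resolution recording.
    # kept_idx: fixed positions of the LR channels in the HR channel layout.
    # mask_token: (n_time,) placeholder used for every missing channel
    #             (assumption: a plain vector stands in for a learned token).
    n_lr, n_time = lr_eeg.shape
    assert len(kept_idx) == n_lr
    assert mask_token.shape == (n_time,)

    # Start with the mask token in every HR channel slot.
    hr_input = np.tile(mask_token, (n_hr_channels, 1))
    # Copy the observed channels into their fixed positions.
    hr_input[kept_idx] = lr_eeg

    # Boolean flag per HR channel: True where the model must reconstruct.
    is_masked = np.ones(n_hr_channels, dtype=bool)
    is_masked[kept_idx] = False
    return hr_input, is_masked

# Toy example: 2 observed channels placed into a 4-channel layout.
lr_eeg = np.arange(8, dtype=float).reshape(2, 4)
kept_idx = [0, 3]
mask_token = np.zeros(4)
hr_input, is_masked = upsample_with_mask_token(lr_eeg, kept_idx, 4, mask_token)
```

In this sketch `hr_input` has one row per high-resolution channel, and `is_masked` marks the rows a downstream model would be responsible for reconstructing.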
