Optimising EEG decoding with refined sampling and multimodal feature integration
Arash Akbarinia
TL;DR
This work tackles EEG-based object decoding by using contrastive learning to align EEG encoder outputs with pretrained multimodal features. It introduces InterDimensional EEG Sampling (IDES), which expands the training space and boosts the signal-to-noise ratio (SNR), and it combines visual features with language features derived from BLIP captions to form a richer multimodal target for EEG alignment. Evaluated on the THINGS EEG2 dataset, the approach achieves ~7% higher Top-1 accuracy than state-of-the-art baselines in the intraparticipant setting. Results are notably strong with LAION-400M CLIP features, and alignment efficacy correlates with the generalisation power of the pretrained features on ImageNet-O/A. The findings suggest that refined sampling and multimodal feature integration can meaningfully enhance EEG decoding and may generalise to other neuroimaging modalities, while computational costs and broader societal implications remain considerations.
Abstract
Electroencephalography (EEG) is a neuroimaging technique that records neural activity in the brain with high temporal resolution. Unlike other methods, EEG does not require prohibitively expensive equipment and can be set up easily using commercially available portable EEG caps, making it an ideal candidate for brain-computer interfaces. However, EEG signals are characterised by poor spatial resolution and high noise levels, which complicate their decoding. In this study, we employ a contrastive learning framework to align encoded EEG features with pretrained CLIP features, achieving a 7% improvement over the state-of-the-art in EEG decoding of object categories. This improvement is attributable in equal measure to (1) a novel online sampling method that boosts the signal-to-noise ratio and (2) multimodal representations that leverage visual and language features to enhance the alignment space. Our analysis reveals a systematic interaction between the architecture and training dataset of the pretrained features and their alignment efficacy for EEG signal decoding. This interaction correlates with the generalisation power of the pretrained features on the ImageNet-O/A datasets ($r = 0.5$). These findings extend beyond EEG signal alignment, offering potential for broader applications in neuroimaging decoding and generic feature alignment.
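
As an illustration of the contrastive alignment described in the abstract, the following is a minimal NumPy sketch of a symmetric, CLIP-style InfoNCE loss between a batch of encoded EEG features and the matching pretrained target features. It is a sketch under stated assumptions, not the paper's implementation: the function names, the temperature of 0.07, and the choice of a symmetric loss are all assumptions made here for illustration, and the EEG encoder, IDES sampling, and the way visual and language features are combined are not reproduced.

```python
import numpy as np


def l2_normalise(x, eps=1e-12):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_alignment_loss(eeg_features, target_features, temperature=0.07):
    """Symmetric InfoNCE loss (assumed form, hypothetical name).

    Row i of eeg_features and row i of target_features are treated as a
    positive pair; every other row in the batch acts as a negative.
    """
    eeg = l2_normalise(eeg_features)
    tgt = l2_normalise(target_features)
    logits = eeg @ tgt.T / temperature  # (batch, batch) scaled cosine similarities
    idx = np.arange(logits.shape[0])
    # EEG -> target direction: softmax over columns of each row.
    loss_eeg_to_tgt = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Target -> EEG direction: softmax over rows of each column.
    loss_tgt_to_eeg = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_eeg_to_tgt + loss_tgt_to_eeg)


def top1_accuracy(eeg_features, target_features):
    """Fraction of EEG rows whose most similar target row is the paired one."""
    sims = l2_normalise(eeg_features) @ l2_normalise(target_features).T
    return float((sims.argmax(axis=1) == np.arange(sims.shape[0])).mean())


# Toy usage with random stand-ins for real features (shapes are arbitrary).
rng = np.random.default_rng(0)
targets = rng.normal(size=(8, 16))                   # stand-in for pretrained features
eeg = targets + 0.1 * rng.normal(size=(8, 16))       # stand-in for encoded EEG
loss = contrastive_alignment_loss(eeg, targets)
accuracy = top1_accuracy(eeg, targets)
```

Minimising a loss of this kind pulls each EEG embedding towards its paired target embedding and pushes it away from the other targets in the batch, which is what makes nearest-neighbour retrieval (the Top-1 metric) meaningful at test time.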
