Decoding Natural Images from EEG for Object Recognition
Yonghao Song, Bingchuan Liu, Xiang Li, Nanlin Shi, Yijun Wang, Xiaorong Gao
TL;DR
This work tackles decoding natural images from EEG for object recognition by introducing NICE, a self-supervised cross-modal framework that learns image representations from EEG via contrastive learning. It combines a temporal-spatial EEG encoder (TSConv) with plug-and-play spatial modules (self-attention and graph attention) and explores pre-trained image encoders for cross-modal alignment. On a 200-way zero-shot task, it reaches top-1 and top-5 accuracies of 15.6% and 42.8%, respectively. The authors analyze the temporal, spatial, and spectral dynamics of the EEG signals and show that the learned EEG representations reflect activity patterns in occipital and temporal regions, which supports the biological plausibility of the approach. The work also highlights practical implications for neural decoding and brain-computer interfaces, and the authors state that code will be released to facilitate future research.
Abstract
Electroencephalography (EEG) signals, known for convenient non-invasive acquisition but low signal-to-noise ratio, have recently gained substantial attention due to their potential for decoding natural images. This paper presents a self-supervised framework to demonstrate the feasibility of learning image representations from EEG signals, particularly for object recognition. The framework uses image and EEG encoders to extract features from paired image stimuli and EEG responses. Contrastive learning aligns these two modalities by constraining their similarity. With the framework, we attain significantly above-chance results on a comprehensive EEG-image dataset, achieving a top-1 accuracy of 15.6% and a top-5 accuracy of 42.8% in challenging 200-way zero-shot tasks. Moreover, we perform extensive experiments to explore the biological plausibility by resolving the temporal, spatial, spectral, and semantic aspects of EEG signals. In addition, we introduce attention modules to capture spatial correlations, providing implicit evidence of the brain activity perceived from EEG data. These findings yield valuable insights for neural decoding and brain-computer interfaces in real-world scenarios. The code will be released at https://github.com/eeyhsong/NICE-EEG.
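
The alignment and evaluation steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a symmetric, CLIP-style InfoNCE objective over cosine similarities, a fixed temperature of 0.07, and zero-shot classification by nearest class-level image feature. The function names (`contrastive_loss`, `topk_accuracy`) and all array shapes are hypothetical and chosen only for illustration. For reference, chance level in a 200-way task is 1/200 = 0.5% for top-1 and 5/200 = 2.5% for top-5.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(eeg_feats, img_feats, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired features.

    Assumption: row i of eeg_feats is the EEG response to the image whose
    features are in row i of img_feats. The temperature is an assumed value,
    not one taken from the paper.
    """
    eeg = l2_normalize(eeg_feats)
    img = l2_normalize(img_feats)
    logits = eeg @ img.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])            # matched pairs lie on the diagonal
    loss_eeg_to_img = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_img_to_eeg = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_eeg_to_img + loss_img_to_eeg)


def topk_accuracy(eeg_feats, class_feats, labels, k):
    """Fraction of EEG trials whose true class is among the k most similar classes."""
    sims = l2_normalize(eeg_feats) @ l2_normalize(class_feats).T
    topk = np.argsort(-sims, axis=1)[:, :k]     # indices of the k highest similarities
    return float((topk == labels[:, None]).any(axis=1).mean())
```

Under these assumptions, training would minimize `contrastive_loss` over batches of EEG and image features, and the 200-way zero-shot evaluation would call `topk_accuracy` with one feature vector per unseen class and `k` set to 1 or 5.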
