Deep Learning Architectures for Code-Modulated Visual Evoked Potentials Detection
Kiran Nair, Hubert Cecotti
TL;DR
This work addresses the challenge of robust single-trial C-VEP decoding for non-invasive BCIs. It systematically compares traditional correlation-based methods with deep learning approaches, including CNNs and Siamese networks, using a 63-bit m-sequence stimulation paradigm with eight-channel EEG from 13 participants across five sessions. The results show that deep architectures substantially outperform the baselines: the multi-class Siamese network achieves the highest average accuracy (96.89%), and distance-based CNN decoding with Earth Mover's Distance is more resilient to latency variations than the other distance metrics tested. Temporal augmentation further improves cross-session generalization. The findings support the potential of end-to-end deep learning for reliable, calibration-light C-VEP BCIs and point to ways of addressing inter-subject variability and enabling real-time deployment.
Abstract
Non-invasive Brain-Computer Interfaces (BCIs) based on Code-Modulated Visual Evoked Potentials (C-VEPs) require highly robust decoding methods to address temporal variability and session-dependent noise in EEG signals. This study proposes and evaluates several deep learning architectures, including convolutional neural networks (CNNs) for 63-bit m-sequence reconstruction and classification, and Siamese networks for similarity-based decoding, alongside canonical correlation analysis (CCA) baselines. EEG data were recorded from 13 healthy adults under single-target flicker stimulation. The proposed deep models significantly outperformed traditional approaches, with distance-based decoding using Earth Mover's Distance (EMD) and constrained EMD showing greater robustness to latency variations than Euclidean and Mahalanobis metrics. Temporal data augmentation with small shifts further improved generalization across sessions. Among all models, the multi-class Siamese network achieved the best overall performance with an average accuracy of 96.89%, demonstrating the potential of data-driven deep architectures for reliable, single-trial C-VEP decoding in adaptive non-invasive BCI systems.
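The intuition behind the reported robustness of EMD to latency variations can be illustrated with a toy example. The sketch below is not the authors' implementation; it assumes a one-dimensional EMD computed as the L1 distance between cumulative sums of unit-mass vectors, and it uses a hypothetical rectangular `pulse` as a stand-in for a decoded response on a 63-sample grid. All function names are illustrative.

```python
import numpy as np


def emd_1d(p, q):
    """1-D Earth Mover's Distance on a unit-spaced grid (assumed formulation).

    Both inputs are non-negative vectors; each is normalized to unit mass,
    and the distance is the L1 norm of the difference of their cumulative sums.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())


def euclidean(p, q):
    """Plain Euclidean distance between two vectors."""
    return float(np.linalg.norm(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)))


def pulse(n, start, width=3):
    """Hypothetical toy response: a rectangular pulse of `width` samples."""
    x = np.zeros(n)
    x[start:start + width] = 1.0
    return x


template = pulse(63, 10)   # reference response
small = pulse(63, 11)      # same response, 1 sample late
mid = pulse(63, 15)        # 5 samples late (no overlap with the template)
large = pulse(63, 30)      # 20 samples late

for name, shifted in [("1", small), ("5", mid), ("20", large)]:
    print(f"shift {name:>2}: euclidean={euclidean(template, shifted):.3f}  "
          f"emd={emd_1d(template, shifted):.3f}")
```

In this toy setting the Euclidean distance stops growing as soon as the two pulses no longer overlap, so a 5-sample and a 20-sample latency look equally far from the template, whereas the EMD scales with the size of the shift and stays small for small latencies. How the paper applies EMD and constrained EMD to the CNN outputs is not specified in the abstract, so this should be read only as an illustration of the metric's behaviour.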
