Adversarial Artifact Detection in EEG-Based Brain-Computer Interfaces
Xiaoqing Chen, Dongrui Wu
TL;DR
This paper highlights security concerns in EEG-based BCIs, where small input perturbations can cause misclassification, and presents the first systematic study of adversarial detection in this domain. The authors adapt three detection methods, Bayesian Uncertainty (BU), Local Intrinsic Dimensionality (LID), and Mahalanobis distance-based confidence (MD_max), and evaluate them against FGSM, PGD, CW, and transferability-based black-box attacks on two EEG datasets using three CNNs, with detection features drawn from the network's last layer. White-box attacks prove easier to detect than black-box ones. LID generally provides the strongest detection, CW attacks are comparatively hard to detect, and overall detection achieves meaningful AUC values across settings. The results support reactive adversarial detection as a viable defense for EEG-based BCIs and point to future work on combining detectors, expanding attack coverage, and integrating proactive defenses to enhance safety.
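Since the summary above reports detection quality as AUC, the following minimal sketch shows how that metric could be computed from a detector's scores. It is an illustration only, not the authors' evaluation code: the function name `detection_auc` and the score values are hypothetical, and the scores stand in for the output of any detector (BU, LID, or MD_max) in which higher means "more likely adversarial".

```python
def detection_auc(scores_benign, scores_adv):
    """AUC as the probability that a random adversarial example scores
    higher than a random benign one, counting ties as 1/2
    (the Mann-Whitney U formulation of the ROC area)."""
    wins = 0.0
    for a in scores_adv:
        for s in scores_benign:
            if a > s:
                wins += 1.0
            elif a == s:
                wins += 0.5
    return wins / (len(scores_adv) * len(scores_benign))


# Hypothetical detector scores; not results from the paper.
benign_scores = [0.1, 0.3, 0.2, 0.4]
adversarial_scores = [0.35, 0.8, 0.6, 0.9]

auc = detection_auc(benign_scores, adversarial_scores)
print(f"detection AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to a detector that cannot separate the two groups, and 1.0 to perfect separation, which is why AUC is a convenient threshold-free way to compare detectors across attacks.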
Abstract
Machine learning has achieved great success in electroencephalogram (EEG) based brain-computer interfaces (BCIs). Most existing BCI research has focused on improving accuracy, but few studies have considered security. Recent studies, however, have shown that EEG-based BCIs are vulnerable to adversarial attacks, where small perturbations added to the input can cause misclassification. Detecting adversarial examples is crucial both to understanding this phenomenon and to defending against it. This paper, for the first time, explores adversarial detection in EEG-based BCIs. Experiments on two EEG datasets using three convolutional neural networks were performed to evaluate the performance of multiple detection approaches. We show that both white-box and black-box attacks can be detected, and that the former are easier to detect.
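To make the notion of a small adversarial perturbation concrete, here is a minimal sketch of a one-step FGSM attack. It rests on loudly stated assumptions: a logistic-regression model stands in for the CNNs used in the paper, and the weights, the four-value "trial", and the budget `eps` are hypothetical toy values rather than real EEG data or the paper's settings.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression stand-in classifier."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(w, b, x, y, eps):
    """One-step FGSM: x_adv = x + eps * sign(dLoss/dx), cross-entropy loss."""
    p = predict_proba(w, b, x)
    # For logistic regression, the input gradient of the loss is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * si for xi, si in zip(x, sign)]


# Hypothetical toy model and input; not real EEG data.
w = [0.8, -0.5, 0.3, -0.9]
b = 0.0
x = [0.1, -0.1, 0.1, -0.1]
y = 1  # true label

x_adv = fgsm(w, b, x, y, eps=0.15)
p_clean = predict_proba(w, b, x)
p_adv = predict_proba(w, b, x_adv)
print(f"clean p(class 1) = {p_clean:.3f}, adversarial p(class 1) = {p_adv:.3f}")
```

In this toy setting every input value moves by at most `eps`, yet the predicted class changes from 1 to 0, which is the kind of misclassification the abstract describes.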
