Advancing Brainwave-Based Biometrics: A Large-Scale, Multi-Session Evaluation
Matin Fallahi, Patricia Arias-Cabarcos, Thorsten Strufe
TL;DR
This work addresses the generalizability and long-term reliability of EEG-based biometrics by leveraging a large-scale public dataset (PEERS) with 345 subjects and 6,007 sessions recorded over five years with three headsets. It shows that deep metric-learning pipelines (notably ResNet1D with SupConLoss and Euclidean comparison) outperform handcrafted features, while performance degrades over time unless enrollment data are refreshed. The study also demonstrates that meaningful authentication is feasible with consumer-grade EEG channel counts and cross-device training, but that performance still falls short of international biometric standards, highlighting the need for much larger training sets and improved architectures. The authors provide open-source code to enable reproducible research and encourage community-driven progress toward scalable, durable brainwave-based authentication.
Abstract
The field of brainwave-based biometrics has gained attention for its potential to revolutionize user authentication through hands-free interaction, resistance to shoulder surfing, continuous authentication, and revocability. However, current research often relies on single-session or limited-session datasets with fewer than 55 subjects, raising concerns about the generalizability of the findings. To address this gap, we conducted a large-scale study using a public brainwave dataset comprising 345 subjects and 6,007 sessions (an average of 17 per subject) recorded over five years using three headsets. Our results reveal that deep learning approaches significantly outperform hand-crafted feature extraction methods. We also observe that the Equal Error Rate (EER) increases over time (e.g., from 6.7% after 1 day to 14.3% after a year). Therefore, it is necessary to reinforce the enrollment set after successful login attempts. Moreover, we demonstrate that fewer brainwave measurement sensors can be used with an acceptable increase in EER, which is necessary for transitioning from medical-grade to affordable consumer-grade devices. Finally, we compare our results to prior work and existing biometric standards. While our performance is on par with or exceeds previous approaches, it still falls short of industrial benchmarks. Based on the results, we hypothesize that further improvements are possible with larger training sets. To support future research, we have open-sourced our analysis code.
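Because the abstract reports its main results as Equal Error Rates, a minimal sketch of how an EER can be computed from comparison scores may help. It is not taken from the authors' released code. It assumes the scores are distances (smaller means more similar, as with a Euclidean comparison), and the function name `equal_error_rate` and the toy values are hypothetical.

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) for distance scores; smaller distance = more similar.

    genuine:  distances between samples of the same subject
    impostor: distances between samples of different subjects
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Use every observed distance as a candidate decision threshold.
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    # A comparison is accepted when its distance is <= threshold.
    far = np.array([(impostor <= t).mean() for t in thresholds])  # false accept rate
    frr = np.array([(genuine > t).mean() for t in thresholds])    # false reject rate
    # The EER is taken where the two error rates are closest to each other.
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2.0, thresholds[i]

# Toy distances for illustration only (not data from the study).
genuine_toy = [0.1, 0.2, 0.3, 0.9]
impostor_toy = [0.4, 0.8, 1.0, 1.2]
eer, thr = equal_error_rate(genuine_toy, impostor_toy)
print(f"EER = {eer:.2f} at threshold {thr:.1f}")  # EER = 0.25 at threshold 0.4
```

With a finite set of scores the two error curves rarely cross exactly, so this sketch averages the false accept and false reject rates at the threshold where they are closest. Interpolating between neighbouring thresholds is a common alternative.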
