Algorithmic Clustering based on String Compression to Extract P300 Structure in EEG Signals
Guillermo Sarasa, Ana Granados, Francisco B Rodríguez
TL;DR
The paper tackles robust identification of the P300 event-related potential (ERP) in EEG despite inter-subject and temporal variability by applying compression-based clustering to EEG objects transformed into ASCII. It introduces a signal-to-ASCII pipeline and evaluates two clustering approaches (CompLearn's minimum quartet tree and Multidimensional Projections via PEx) on the BCI Competition IIb and III datasets, showing that clustering driven by the Normalized Compression Distance (NCD) can reveal P300 structure and inform electrode selection. The object construction parameters (M and C) strongly influence clustering quality: the best configurations yield separable P300 and non-P300 groups, and the results are consistent with the literature on ERP localization. Overall, the method provides a complementary, parameter-light tool for EEG analysis and P300 detection in BCIs that is robust to variability and applicable to electrode-space exploration.
Abstract
P300 is an Event-Related Potential widely used in Brain-Computer Interfaces, but its detection is challenging due to inter-subject and temporal variability. This work introduces a clustering methodology based on Normalized Compression Distance (NCD) to extract the P300 structure, ensuring robustness against variability. We propose a novel signal-to-ASCII transformation to generate compression-friendly objects, which are then clustered using a hierarchical tree-based method and a multidimensional projection approach. Experimental results on two datasets demonstrate the method's ability to reveal relevant P300 structures, showing clustering performance comparable to state-of-the-art approaches. Furthermore, analysis at the electrode level suggests that the method could assist in electrode selection for P300 detection. This compression-driven clustering methodology offers a complementary tool for EEG analysis and P300 identification.
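As a rough illustration of the distance underlying this methodology, the standard definition of NCD is NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed size of its argument and xy is the concatenation of x and y. The sketch below is not the authors' pipeline: it assumes zlib as the compressor (the abstract does not name one), uses toy byte strings rather than ASCII-transformed EEG objects, and the helper names `compressed_size` and `ncd` are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib compression (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compressed size of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy objects, not EEG data: two copies of a regular pattern and one
# pseudo-random string generated from a fixed seed for reproducibility.
rng = random.Random(0)
regular_a = b"ABCD" * 200
regular_b = b"ABCD" * 200
noise = bytes(rng.getrandbits(8) for _ in range(800))

d_similar = ncd(regular_a, regular_b)
d_dissimilar = ncd(regular_a, noise)
print(f"NCD(similar)    = {d_similar:.3f}")
print(f"NCD(dissimilar) = {d_dissimilar:.3f}")
```

Objects that share structure compress well together and so get a smaller distance than unrelated ones. A matrix of such pairwise distances is the kind of input that a clustering step, such as the tree-based and projection-based methods mentioned above, would consume.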
