Algorithmic Clustering based on String Compression to Extract P300 Structure in EEG Signals

Guillermo Sarasa, Ana Granados, Francisco B Rodríguez

TL;DR

The paper tackles robust P300 ERP identification in EEG despite inter-subject and temporal variability by applying compression-based clustering to ASCII-transformed EEG objects. It introduces a signal-to-ASCII pipeline and evaluates two clustering approaches (CompLearn's minimum quartet tree and Multidimensional Projections via PEx) on the BCI Competition II (problem 2b) and BCI Competition III datasets, showing that NCD-driven clustering can reveal P300 structure and inform electrode selection. The object construction parameters ($M$, the number of objects averaged, and $C$, the number of averaged objects concatenated) strongly influence clustering quality: the best configurations yield separable P300 and non-P300 groups, and the electrode-level results are consistent with the literature on ERP localization. Overall, the method provides a complementary, parameter-light tool for EEG analysis and P300 detection in BCIs that is robust to variability and applicable to electrode-space exploration.

Abstract

P300 is an Event-Related Potential widely used in Brain-Computer Interfaces, but its detection is challenging due to inter-subject and temporal variability. This work introduces a clustering methodology based on Normalized Compression Distance (NCD) to extract the P300 structure, ensuring robustness against variability. We propose a novel signal-to-ASCII transformation to generate compression-friendly objects, which are then clustered using a hierarchical tree-based method and a multidimensional projection approach. Experimental results on two datasets demonstrate the method's ability to reveal relevant P300 structures, showing clustering performance comparable to state-of-the-art approaches. Furthermore, analysis at the electrode level suggests that the method could assist in electrode selection for P300 detection. This compression-driven clustering methodology offers a complementary tool for EEG analysis and P300 identification.
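
As a rough illustration of the distance underlying the method, the Normalized Compression Distance between two objects $x$ and $y$ is $\mathrm{NCD}(x, y) = \frac{C(xy) - \min\{C(x), C(y)\}}{\max\{C(x), C(y)\}}$, where $C(\cdot)$ is the compressed size. The sketch below is a minimal version of that formula, assuming bzip2 as the compressor (the compressor used in the paper is not stated here); the function names and the toy inputs are hypothetical and are not the authors' code.

```python
import bz2
import random


def compressed_size(data: bytes) -> int:
    """Length in bytes of the bzip2-compressed input (assumed compressor)."""
    return len(bz2.compress(data))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compressed size of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy inputs (hypothetical, not EEG data): a periodic string and seeded noise.
rng = random.Random(0)
periodic = b"0123456789" * 200
noise = bytes(rng.randrange(256) for _ in range(2000))

print(ncd(periodic, periodic))  # small: the second copy adds little information
print(ncd(periodic, noise))     # close to 1: the two strings share no structure
```

Objects with shared structure compress well together and get a small distance, which is what lets a clustering method group them without hand-crafted features.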

Paper Structure

This paper contains 16 sections, 3 equations, 17 figures, 2 tables.

Figures (17)

  • Figure 1: Scheme of a stimulus-based BCI system. The signal is recorded by means of an acquisition technique, for instance EEG. The resulting digital signal is then processed (filtering + feature extraction + classification) so that it can be parsed into commands. Finally, the extracted commands are sent through an interface to the different systems or devices that perform the desired actions (system A and/or system B in the figure). These systems can be, for instance, a commanded wheelchair, a BCI speller for sending messages, a commanded robotic arm, etc.
  • Figure 2: Scheme of six P300-ERPs taken from the BCI Competition II problem 2b (speller matrix) dataset. Each 600 ms interval was taken after a stimulus in which the intensified row or column belonged to the target character. The P300-ERP should manifest as an increase in amplitude around 300 ms after the stimulus.
  • Figure 3: Scheme of a P300 BCI system. The subject pays attention to a speller matrix program on a screen, in which the different columns and rows are intensified randomly. At the same time, brain activity is recorded through EEG. As a result of the intensification of the row or column containing the focused character (marked as * in Figure 4), the signal amplitude should increase. This increase appears around 300 ms after the stimulus and does not last longer than 600 ms.
  • Figure 4: Structure used in the recording session shown in Figure 3. This figure is a reduced but more detailed version of Figure 2. Each numbered source corresponds to one stimulus (rows 1-6 and columns 7-12) of the speller matrix, and each pulse corresponds to an intensification of that row or column. Once the infrequent stimulus appears, an amplitude increase should appear in the recorded signal around 300 ms later. The change generated by the ERP should not continue beyond 600 ms after the stimulus. In the figure, the signal following the marked row and column should manifest a P300-ERP. In this case, the subject was paying attention to the character "E", according to Figure 3.
  • Figure 5: Scheme followed in the object definition. For each intensification of the two target stimuli (the row and column of the target character, which should contain a P300-ERP), our system creates a first object (A), which is stored in a common set. All these objects are then shuffled and grouped into subsets of $M$ elements, and each subset is averaged to compose the second objects (B). Finally, the grouping process is repeated to concatenate the averaged objects in groups of $C$ into the final objects (C). In this figure $M = 4$ and $C = 2$. This process is repeated for the non-P300 objects as well.
  • ...and 12 more figures
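
The object construction described in the Figure 5 caption (shuffle, average in subsets of $M$, concatenate in groups of $C$) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: epochs are plain lists of samples, leftover epochs that do not fill a complete subset are dropped, and the names `average_epochs` and `build_objects` are hypothetical. The signal-to-ASCII step is omitted because its details are not given in this section.

```python
import random
from typing import List, Sequence


def average_epochs(epochs: Sequence[Sequence[float]]) -> List[float]:
    """Sample-wise mean of equally long epochs (objects A -> one object B)."""
    count = len(epochs)
    return [sum(samples) / count for samples in zip(*epochs)]


def build_objects(
    epochs: Sequence[Sequence[float]], m: int, c: int, seed: int = 0
) -> List[List[float]]:
    """Shuffle the epochs, average subsets of m, concatenate groups of c."""
    rng = random.Random(seed)
    shuffled = list(epochs)
    rng.shuffle(shuffled)

    # Objects (B): average disjoint subsets of m epochs.
    # Assumption: an incomplete trailing subset is discarded.
    averaged = [
        average_epochs(shuffled[i:i + m])
        for i in range(0, len(shuffled) - m + 1, m)
    ]

    # Objects (C): concatenate disjoint groups of c averaged epochs.
    final_objects = []
    for i in range(0, len(averaged) - c + 1, c):
        concatenated: List[float] = []
        for avg in averaged[i:i + c]:
            concatenated.extend(avg)
        final_objects.append(concatenated)
    return final_objects


# Toy data (hypothetical): 16 constant epochs of 5 samples each.
epochs = [[float(k)] * 5 for k in range(16)]
objects = build_objects(epochs, m=4, c=2)  # same M and C as in Figure 5
print(len(objects), len(objects[0]))
```

With 16 epochs, $M = 4$ and $C = 2$, the sketch produces 2 final objects of 10 samples each. Averaging trades the number of available objects for a cleaner ERP, while concatenation lengthens each object, which is one way to see why $M$ and $C$ affect clustering quality.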