Investigating Brain Connectivity and Regional Statistics from EEG for early stage Parkinson's Classification

Amarpal Sahota, Amber Roguski, Matthew W Jones, Zahraa S. Abdallah, Raul Santos-Rodriguez

TL;DR

This study tackles early-stage Parkinson's disease (PD) detection from EEG by fusing brain connectivity metrics with regional frequency-based statistics, using recordings from wakefulness and four sleep stages. Using AdaBoost, it evaluates nine connectivity measures and finds that the Phase Lag Index (PLI) in the gamma band is the strongest single connectivity feature, reaching 86.2% accuracy on N1 data. A feature-level fusion of PLI with regional statistics achieves the best overall accuracy (91.3% on N1), with 96% precision and 80% recall. The dataset comprises 30 participants (11 PD, 19 healthy controls) across five arousal states, and within every arousal state the fusion improves accuracy over either feature set alone. The findings suggest that connectivity and regional EEG statistics carry complementary information for early PD classification, with N1 sleep emerging as a particularly informative state for discrimination.

Abstract

We evaluate the effectiveness of combining brain connectivity metrics with signal statistics for early-stage Parkinson's Disease (PD) classification using electroencephalogram (EEG) data. The data come from five arousal states: wakefulness and four sleep stages (N1, N2, N3 and REM). Our pipeline uses an AdaBoost model on a challenging early-stage PD classification task with only 30 participants (11 PD, 19 Healthy Control). Evaluating nine brain connectivity metrics, we find that the best connectivity metric differs for each arousal state, with Phase Lag Index (PLI) achieving the highest individual classification accuracy of 86% on N1 data. Our pipeline achieves an accuracy of 78% using regional signal statistics only and 86% using brain connectivity only, whereas combining the two achieves a best accuracy of 91%. This best performance is achieved on N1 data using PLI combined with statistics derived from the frequency characteristics of the EEG signal. This model also achieves a recall of 80% and a precision of 96%. Furthermore, we find that on data from each arousal state, combining PLI with regional signal statistics improves classification accuracy versus using signal statistics or brain connectivity alone. We therefore conclude that combining brain connectivity statistics with regional EEG statistics yields the best classifier performance on early-stage Parkinson's in our experiments. Additionally, we find that N1 EEG outperforms the other arousal states for classification of Parkinson's, and expect this could be due to disrupted N1 sleep in PD. This should be explored in future work.
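To make the fused feature set concrete, the following is a minimal sketch of how a PLI matrix could be computed and concatenated with regional statistics. It uses the standard definition PLI = |mean(sign(sin(Δφ)))| over instantaneous phase differences; the function names, the FFT-based analytic signal, the upper-triangle flattening, and the shape of `regional_stats` are all assumptions made for illustration and are not taken from the paper's pipeline.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal along the last axis, built with an FFT-based Hilbert transform."""
    n = x.shape[-1]
    spectrum = np.fft.fft(x, axis=-1)
    # Keep DC (and Nyquist for even n), double positive frequencies, zero the negatives.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights, axis=-1)


def phase_lag_index(eeg):
    """PLI for every channel pair.

    eeg: array of shape (n_channels, n_samples), assumed to be already
    band-pass filtered to the frequency band of interest.
    Returns a symmetric (n_channels, n_channels) matrix with values in [0, 1].
    """
    phase = np.angle(analytic_signal(eeg))
    # Pairwise phase differences, shape (n_channels, n_channels, n_samples).
    diff = phase[:, None, :] - phase[None, :, :]
    return np.abs(np.mean(np.sign(np.sin(diff)), axis=-1))


def fuse_features(pli, regional_stats):
    """Hypothetical feature-level fusion: unique PLI pairs + flattened regional statistics."""
    upper = np.triu_indices(pli.shape[0], k=1)  # each channel pair once, no diagonal
    return np.concatenate([pli[upper], np.ravel(regional_stats)])
```

One fused vector per recording could then be passed to a boosted classifier such as scikit-learn's `AdaBoostClassifier`; the choice of library and any hyperparameters are likewise assumptions here, since the abstract names only the model family.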

Paper Structure (17 sections, 5 figures, 4 tables)

Figures (5)

  • Figure 1: EEG channels grouped into 13 brain regions of interest [regionaleeg, sahota2023interpretable].
  • Figure 2: Pipeline for signal statistics classifier.
  • Figure 3: Example connectivity grid with electrodes (channels) along the x and y axes. Grid is filled with connectivity value for each electrode-electrode pair. Colour gradient fill with blue as 0 and dark red as 1.
  • Figure 4: Pipeline for connectivity classifier.
  • Figure 5: Confusion matrix for the highest-accuracy model (91.3%). The model uses PLI features and regional statistical features from N1 data. It was run with two random seeds, so the matrix contains twice the total number of samples.
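
The accuracy, precision and recall reported for the Figure 5 model all follow from the four cells of a binary confusion matrix. As a rough sketch, assuming PD is the positive class and that the matrices from the individual random seeds are summed cell by cell, the metrics could be derived as below. The function names and the tuple layout are hypothetical, and no counts from the paper are reproduced here.

```python
def pool_confusions(matrices):
    """Sum per-seed confusion matrices given as (tp, fp, fn, tn) tuples."""
    return tuple(sum(cells) for cells in zip(*matrices))


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall, with the positive class assumed to be PD."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of all samples predicted PD, the fraction that truly are PD.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of all true PD samples, the fraction the model recovered.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

Pooling before computing the metrics weights every sample equally across seeds, which is one plausible reading of a confusion matrix that holds double the number of samples.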