SSDLabeler: Realistic semi-synthetic data generation for multi-label artifact classification in EEG

Taketo Akama, Akima Connelly, Shun Minamikawa, Natalia Polouliakh

TL;DR

SSDLabeler is a framework that generates realistic, annotated semi-synthetic data (SSD) by decomposing real EEG with ICA, verifying artifacts at the epoch level using RMS and PSD criteria, and reinjecting multiple artifact types into clean data. It establishes a scalable foundation for artifact handling that captures the co-occurrence and complexity of real EEG.

Abstract

EEG recordings are inherently contaminated by artifacts such as ocular, muscular, and environmental noise, which obscure neural activity and complicate preprocessing. Artifact classification offers advantages in stability and transparency, providing a viable alternative to ICA-based methods that can be used flexibly alongside human inspection and across various applications. However, artifact classification is limited by its training data: it requires extensive manual labeling, which cannot fully cover the diversity of real-world EEG. Semi-synthetic data (SSD) methods have been proposed to address this limitation, but prior approaches typically injected single artifact types using ICA components or required separately recorded artifact signals, reducing both the realism of the generated data and the applicability of the method. To overcome these issues, we introduce SSDLabeler, a framework that generates realistic, annotated SSDs by decomposing real EEG with ICA, verifying artifacts at the epoch level using RMS and PSD criteria, and reinjecting multiple artifact types into clean data. When used to train a multi-label artifact classifier, SSDLabeler data improved accuracy on raw EEG across diverse conditions compared with training on prior SSD or on raw EEG, establishing a scalable foundation for artifact handling that captures the co-occurrence and complexity of real EEG.

Paper Structure

This paper contains 17 sections, 8 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: McNemar's test results for per-label classification performance across the three training datasets on the motor execution session test data. The top row (a) compares Raw EEG with Our SSD, the middle row (b) compares Raw EEG with the Previous Work's SSD, and the bottom row (c) compares the Previous Work's SSD with Our SSD. In all panels, columns correspond to artifact categories in the following order: Clean, Eye, Muscle, Heart, Line, Channel, and Other.
  • Figure 2: McNemar's test results for per-label classification performance across the three training datasets on the noise session test data. The top row (a) compares Raw EEG with Our SSD, the middle row (b) compares Raw EEG with the Previous Work's SSD, and the bottom row (c) compares the Previous Work's SSD with Our SSD. In all panels, columns correspond to artifact categories in the following order: Clean, Eye, Muscle, Heart, Line, Channel, and Other.
  • Figure 3: Preprocessing and annotation pipeline for semi-synthetic EEG generation. Raw EEG was first preprocessed with bandpass (1–50 Hz) and notch filtering (60 Hz), then epoched into three-second segments. ICA and ICLabel were then applied to decompose the signals and assign artifact probabilities to each IC. Artifact ICs with ICLabel scores $\geq 0.6$ (e.g., eye, muscle, heart, line, channel, or other) were reconstructed and segmented into time windows. RMS and PSD thresholding were subsequently applied to ensure the presence of true contamination within each segment. This pipeline enables reliable annotation of artifact-contaminated EEG segments, forming the basis for generating labeled, realistic, multi-label SSD.
  • Figure 4: RMS thresholding procedure used for artifact verification. For all artifact types except muscle, IC-derived signals were segmented into epochs and the RMS was computed for each segment. Epochs with RMS values exceeding the predefined threshold were labeled as artifact-present (1), whereas those below the threshold were labeled as clean (0). The example plot illustrates $RMS_{\text{eye}}$, where segments above the red dashed line are identified as eye-related artifacts.
  • Figure 5: PSD thresholding procedure for muscle artifact detection. The PSD of muscle component segments and correlation values were computed. Segments with correlation $\leq 0.8$ relative to the clean PSD template were classified as containing muscle artifacts (1), while those above the threshold were labeled as clean (0). A simulated muscle spectrum ($PSD_{\text{muscle}}$) was used as an example to show elevated high-frequency activity characteristic of muscle contamination.
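
The two verification rules described in the captions of Figures 4 and 5 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`rms_labels`, `muscle_labels`, `periodogram`, `pearson`) are hypothetical, the RMS threshold is left as a free parameter because the captions do not state its value, and the PSD is assumed to be a plain periodogram compared with Pearson correlation on linear power, which the captions do not specify. Only the 0.8 correlation cutoff and the direction of each comparison come from the captions. The code uses the standard library so it runs anywhere; a real pipeline would more likely use NumPy and SciPy.

```python
import cmath
import math


def rms(segment):
    """Root mean square of one segment (a sequence of samples)."""
    return math.sqrt(sum(x * x for x in segment) / len(segment))


def rms_labels(segments, threshold):
    """Return 1 for segments whose RMS exceeds the threshold, else 0.

    The threshold value is an assumption supplied by the caller.
    """
    return [1 if rms(seg) > threshold else 0 for seg in segments]


def periodogram(segment):
    """Naive one-sided periodogram via a direct DFT (O(n^2); demo only)."""
    n = len(segment)
    mean = sum(segment) / n
    centered = [x - mean for x in segment]
    psd = []
    for k in range(1, n // 2 + 1):  # skip the DC bin
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        psd.append(abs(coeff) ** 2 / n)
    return psd


def pearson(a, b):
    """Pearson correlation of two equal-length sequences.

    Flat (zero-variance) inputs are not handled in this sketch.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def muscle_labels(segments, clean_psd_template, threshold=0.8):
    """Return 1 where the segment PSD correlates <= threshold with the
    clean PSD template (muscle artifact present), else 0."""
    return [
        1 if pearson(periodogram(seg), clean_psd_template) <= threshold else 0
        for seg in segments
    ]
```

As a usage example under these assumptions, a 10 Hz sinusoid sampled at 100 Hz can serve as the clean template; a copy of it with a strong 40 Hz component added has a PSD that correlates poorly with the template and is therefore labeled 1 by `muscle_labels`, while the unmodified sinusoid is labeled 0.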