Bayesian uncertainty-aware deep learning with noisy labels: Tackling annotation ambiguity in EEG seizure detection

Deeksha M. Shama, Archana Venkataraman

TL;DR

Bayesian UncertaiNty-aware Deep Learning (BUNDL) is a novel algorithm that informs a deep learning model of label ambiguities, thereby enhancing the robustness of seizure detection systems. It offers a straightforward, model-agnostic method for training deep neural networks with noisy training labels, and it adds no parameters to existing architectures.

Abstract

Deep learning is advancing EEG processing for automated epileptic seizure detection and onset zone localization, yet its performance relies heavily on high-quality annotated training data. However, scalp EEG is susceptible to high noise levels, which in turn leads to imprecise annotations of seizure timing and characteristics. This "label noise" presents a significant challenge in model training and generalization. In this paper, we introduce Bayesian UncertaiNty-aware Deep Learning (BUNDL), a novel algorithm that informs a deep learning model of label ambiguities, thereby enhancing the robustness of seizure detection systems. By integrating domain knowledge into an underlying Bayesian framework, we derive a novel KL-divergence-based loss function that capitalizes on uncertainty to better learn seizure characteristics from scalp EEG. Thus, BUNDL offers a straightforward and model-agnostic method for training deep neural networks with noisy training labels that does not add any parameters to existing architectures. Additionally, we explore the impact of an improved detection system on the task of automated onset zone localization. We validate BUNDL using a comprehensive simulated EEG dataset and two publicly available datasets collected by Temple University Hospital (TUH) and Boston Children's Hospital (CHB-MIT). Results show that BUNDL consistently identifies noisy labels and improves the robustness of three base models under various label noise conditions. We also evaluate cross-site generalizability and quantify the computational cost of all methods. Ultimately, BUNDL is a reliable method that can be seamlessly integrated with existing deep models used in clinical practice, enabling the training of trustworthy models for epilepsy evaluation.

Paper Structure

This paper contains 36 sections, 5 equations, 7 figures, 7 tables, 2 algorithms.

Figures (7)

  • Figure 1: Overall pipeline of BUNDL.
  • Figure 2: Simulated EEG and label noise generation pipeline using SEREEGA (Krol et al., 2018). The top half shows the source-domain components: the source positions and the five types of signals assigned to the corresponding sources, marked with the following icons: background sources as blue circles, spike noise as a purple triangle, and the seizure source as a red diamond. The background sources are fixed in position, while the latter two are randomly assigned per simulated participant. The bottom half shows the scalp electrode layout of the 10-20 montage and an example EEG with the true seizure and noisy annotations marked.
  • Figure 3: Loss curves (a) from training and (b) from validation folds of all simulated experiments of training CNN with BUNDL.
  • Figure 4: Box plots of performance metrics of three models trained using various strategies across five label noise types. Results are organized by deep network across columns, (left) DeepSOZ, (middle) TGCN, and (right) CNN, and by evaluation metric across rows: AUROC, AUPRC, FPR (min/hour), sensitivity, and latency (seconds). Each subfigure presents performance from BUNDL alongside three baseline comparisons under five label noise cases: rand (symmetric noise), over 0.1 (over-segmented at 10%), over 0.3 (over-segmented at 30%), under 0.1 (under-segmented at 10%), and under 0.3 (under-segmented at 30%).
  • Figure 5: Seizure detection performance on simulated data for different uncertainty quantification methods using the DeepSOZ architecture. Multiple metrics (AUROC, AUPRC, FPR in min/hour, sensitivity, and latency in seconds) are shown. Three label noise settings are considered: rand (randomized symmetric), over 0.3 (30% over-segmentation), and under 0.3 (30% under-segmentation of seizures).
  • ...and 2 more figures
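
The label noise settings named in the Figure 4 and Figure 5 captions (rand, over, under) can be illustrated with a minimal sketch. This is not the paper's simulation code: the function names `flip_symmetric` and `resegment` are hypothetical, and the exact meaning of "over-segmented at 10%" is an assumption here, namely that each seizure boundary moves outward (or inward, for under-segmentation) by that fraction of the seizure's length.

```python
import numpy as np


def seizure_runs(labels):
    """Return (start, stop) index pairs for each contiguous run of 1s (stop exclusive)."""
    padded = np.concatenate(([0], labels, [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return list(zip(starts, stops))


def flip_symmetric(labels, rate, rng):
    """'rand' setting (assumed): flip each label independently with probability `rate`."""
    labels = np.asarray(labels, dtype=int)
    flips = rng.random(labels.shape) < rate
    return np.where(flips, 1 - labels, labels)


def resegment(labels, frac):
    """'over' (frac > 0) or 'under' (frac < 0) setting, under the assumed convention.

    Each boundary of a seizure run of length n moves by round(|frac| * n) samples:
    outward for over-segmentation, inward for under-segmentation.
    """
    if abs(frac) > 1:
        raise ValueError("frac must lie in [-1, 1]")
    labels = np.asarray(labels, dtype=int)
    noisy = labels.copy()
    for start, stop in seizure_runs(labels):
        shift = int(round(abs(frac) * (stop - start)))
        if frac > 0:
            # Widen the annotated seizure, clipped to the recording limits.
            noisy[max(start - shift, 0):min(stop + shift, len(labels))] = 1
        elif frac < 0:
            # Trim the annotated seizure at both ends.
            noisy[start:start + shift] = 0
            noisy[stop - shift:stop] = 0
    return noisy


# Toy recording: 30 windows with one seizure spanning windows 10..19.
clean = np.zeros(30, dtype=int)
clean[10:20] = 1

rng = np.random.default_rng(0)
rand_noisy = flip_symmetric(clean, rate=0.1, rng=rng)
over_noisy = resegment(clean, 0.3)
under_noisy = resegment(clean, -0.3)
```

In this toy setup, the over-segmented annotation covers clean seizure windows plus non-seizure windows on each side, while the under-segmented annotation misses the onset and offset windows. The paper's own noise generation may differ from this convention.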