Adversarial Spatio-Temporal Attention Networks for Epileptic Seizure Forecasting

Zan Li, Kyongmin Yeo, Wesley Gifford, Lara Marcuse, Madeline Fields, Bülent Yener

TL;DR

This work tackles epileptic seizure forecasting from multivariate EEG by introducing STAN, a cascaded Adversarial Spatio-Temporal Attention Network that jointly models spatial connectivity and temporal dynamics. The approach employs three cascaded attention blocks with alternating spatial and temporal modules and a gradient-penalized adversarial discriminator to learn robust preictal representations from clearly defined 15-minute windows, enabling reliable alarms typically 15–45 minutes before onset. Empirical evaluation on the CHB-MIT and MSSM datasets shows state-of-the-art sensitivity with low false-alarm rates (96.6% sensitivity at 0.011 false detections per hour on CHB-MIT; 94.2% sensitivity at 0.063 false detections per hour on MSSM) while maintaining edge-friendly efficiency (2.3M parameters, 45 ms latency, 180 MB memory). The framework supports real-time monitoring and subject-specific adaptability without individualized training, and it extends to other healthcare time series forecasting tasks, offering a practical path toward precision medicine in epilepsy management and beyond.

Abstract

Forecasting epileptic seizures from multivariate EEG signals represents a critical challenge in healthcare time series prediction, requiring high sensitivity, low false alarm rates, and subject-specific adaptability. We present STAN, an Adversarial Spatio-Temporal Attention Network that jointly models spatial brain connectivity and temporal neural dynamics through cascaded attention blocks with alternating spatial and temporal modules. Unlike existing approaches that assume fixed preictal durations or separately process spatial and temporal features, STAN captures bidirectional dependencies between spatial and temporal patterns through a unified cascaded architecture. Adversarial training with gradient penalty enables robust discrimination between interictal and preictal states learned from clearly defined 15-minute preictal windows. Continuous 90-minute pre-seizure monitoring reveals that the learned spatio-temporal attention patterns enable early detection: reliable alarms trigger at subject-specific times (typically 15-45 minutes before onset), reflecting the model's capacity to capture subtle preictal dynamics without requiring individualized training. Experiments on two benchmark EEG datasets (CHB-MIT scalp: 8 subjects, 46 events; MSSM intracranial: 4 subjects, 14 events) demonstrate state-of-the-art performance: 96.6% sensitivity with 0.011 false detections per hour and 94.2% sensitivity with 0.063 false detections per hour, respectively, while maintaining computational efficiency (2.3M parameters, 45 ms latency, 180 MB memory) for real-time edge deployment. Beyond epilepsy, the proposed framework provides a general paradigm for spatio-temporal forecasting in healthcare and other time series domains where individual heterogeneity and interpretability are crucial.
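The two headline metrics in the abstract, sensitivity and false detections per hour, can be illustrated with a minimal sketch. The function names and the counts in the usage lines are hypothetical and are not taken from the paper; the sketch assumes the common definitions (forecasted seizures divided by total seizures, and false alarms divided by hours of interictal recording), which may differ in detail from the paper's evaluation protocol.

```python
def sensitivity(forecasted_seizures: int, total_seizures: int) -> float:
    """Fraction of seizures preceded by a correct alarm (assumed definition)."""
    if total_seizures <= 0:
        raise ValueError("total_seizures must be positive")
    return forecasted_seizures / total_seizures


def false_detections_per_hour(false_alarms: int, interictal_hours: float) -> float:
    """False alarms raised per hour of interictal recording (assumed definition)."""
    if interictal_hours <= 0:
        raise ValueError("interictal_hours must be positive")
    return false_alarms / interictal_hours


# Hypothetical counts, chosen only to show the arithmetic.
example_sn = sensitivity(forecasted_seizures=9, total_seizures=10)
example_fdr = false_detections_per_hour(false_alarms=2, interictal_hours=40.0)
print(f"sensitivity = {example_sn:.1%}, false detections per hour = {example_fdr:.3f}")
```

Reporting both numbers together matters because either one can be improved trivially at the expense of the other: a model that always raises an alarm has perfect sensitivity but an unusable false detection rate.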

Paper Structure

This paper contains 16 sections, 5 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: STAN architecture showing three cascaded attention networks (Attention Network_i, _j, _k) processing raw EEG input. Each network contains a spatial (blue) and a temporal (orange) attention module with $H=4$ attention heads. The resulting $M=3$ spatial and $M=3$ temporal attention maps are aggregated via an MLP and passed to the discriminator for adversarial training. STAN is pre-trained with an MSE reconstruction loss to learn spatio-temporal representations.
  • Figure 2: Detailed architecture of the spatial and temporal attention modules. (A) The spatial attention module employs a time encoder (1D CNN) followed by a spatial multi-head attention layer to capture inter-channel connectivity. (B) The temporal attention module uses a spatial encoder followed by a temporal multi-head attention layer to model temporal evolution. Both include residual connections (curved arrows) with tanh activation for training stability. The legend shows the matrix multiplication symbol and the attention map representations.
  • Figure 3: Discriminator architecture. (A) Main structure: the 6 attention maps (3 spatial in blue, 3 temporal in orange) from the cascaded networks are processed by feature extractors, aggregated via sum fusion, and passed through a final linear layer with sigmoid activation to generate the discriminator score. (B-C) The feature extractors for the spatial and temporal attention maps each consist of a 2D convolution, a flatten step, dropout, and linear layers with ReLU activation.
  • Figure 4: Real-time seizure forecasting showing discriminator scores smoothed with a 30-second moving average filter over the 90 minutes before onset. Top: recording-3 (chb01_03); Bottom: recording-26 (chb22_26). Both show a consistent, gradual shift from interictal states (high scores) to preictal states (low scores), providing at least 15 minutes of intervention time. The horizontal dashed line indicates the threshold $\tau=0.5$.
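
The monitoring scheme described in the Figure 4 caption (discriminator scores, a 30-second moving average, and a threshold $\tau=0.5$ with low scores indicating the preictal state) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes one discriminator score per second (so a 30-second window is 30 samples), a trailing window, and an alarm rule that fires the first time the smoothed score drops below $\tau$. The function names `moving_average` and `first_alarm_index` are hypothetical, and the score trace is synthetic.

```python
from collections import deque
from typing import Iterable, List, Optional


def moving_average(scores: Iterable[float], window: int) -> List[float]:
    """Trailing moving average; early samples average over what is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf: deque = deque(maxlen=window)
    total = 0.0
    smoothed = []
    for score in scores:
        if len(buf) == window:
            total -= buf[0]  # oldest sample is about to be evicted
        buf.append(score)
        total += score
        smoothed.append(total / len(buf))
    return smoothed


def first_alarm_index(smoothed: List[float], tau: float = 0.5) -> Optional[int]:
    """Index of the first smoothed score below tau, or None if no alarm fires.

    Assumed rule: low scores indicate the preictal state, as in the caption.
    """
    for i, value in enumerate(smoothed):
        if value < tau:
            return i
    return None


# Synthetic trace (not real data): 60 interictal-like scores, then 60 preictal-like.
synthetic_scores = [1.0] * 60 + [0.0] * 60
smoothed_scores = moving_average(synthetic_scores, window=30)
alarm_at = first_alarm_index(smoothed_scores, tau=0.5)
print("alarm raised at sample index:", alarm_at)
```

Smoothing before thresholding trades a short detection delay for fewer spurious alarms, since a single noisy score cannot cross the threshold on its own.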