Forecasting Epileptic Seizures from Contactless Camera via Cross-Species Transfer Learning

Mingkai Zhai, Wei Wang, Zongsheng Li, Quanying Liu

Abstract

Epileptic seizure forecasting is a clinically important yet challenging problem in epilepsy research. Existing approaches predominantly rely on neural signals such as electroencephalography (EEG), which require specialized equipment and limit long-term deployment in real-world settings. In contrast, video data provide a non-invasive and accessible alternative, yet existing video-based studies mainly focus on post-onset seizure detection, leaving seizure forecasting largely unexplored. In this work, we formulate a novel task of video-based epileptic seizure forecasting, where short pre-ictal video segments (3-10 seconds) are used to predict whether a seizure will occur within the subsequent 5 seconds. To address the scarcity of annotated human epilepsy videos, we propose a cross-species transfer learning framework that leverages large-scale rodent video data for auxiliary pretraining. This enables the model to capture seizure-related behavioral dynamics that generalize across species. Experimental results demonstrate that our approach achieves over 70% prediction accuracy under a strictly video-only setting and outperforms existing baselines. These findings highlight the potential of cross-species learning for building non-invasive, scalable early-warning systems for epilepsy.
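
The task formulation above can be illustrated with a small labeling sketch. It is not the authors' data pipeline: the function names, the 1-second stride, and the rule of discarding clips that contain a seizure onset are all assumptions made for illustration. Only the clip length (3-10 seconds) and the 5-second forecasting horizon come from the abstract.

```python
def label_clip(clip_end_s, seizure_onsets_s, horizon_s=5.0):
    """Return 1 if a seizure onset falls in (clip_end_s, clip_end_s + horizon_s], else 0."""
    return int(any(clip_end_s < t <= clip_end_s + horizon_s for t in seizure_onsets_s))


def make_samples(recording_len_s, seizure_onsets_s,
                 clip_len_s=5.0, stride_s=1.0, horizon_s=5.0):
    """Slide a fixed-length window over a recording and label each clip.

    Assumption: clips that contain an onset are dropped, so every kept
    clip ends strictly before or starts strictly after each onset.
    """
    samples = []
    num_clips = int((recording_len_s - clip_len_s) // stride_s) + 1
    for i in range(num_clips):
        start = i * stride_s
        end = start + clip_len_s
        if any(start <= t <= end for t in seizure_onsets_s):
            continue  # clip overlaps an onset, so it is not pre-ictal
        samples.append((start, end, label_clip(end, seizure_onsets_s, horizon_s)))
    return samples


# Hypothetical 20-second recording with a single onset at t = 12 s.
samples = make_samples(20.0, [12.0])
```

In this toy setup, a clip is positive only when it ends within 5 seconds before the onset. Clips recorded while a seizure is ongoing would need seizure offset times to be excluded, which this sketch does not model.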

Paper Structure (17 sections, 2 equations, 2 figures, 2 tables)

Figures (2)

  • Figure 1: The proposed two-stage framework for seizure forecasting. Stage 1: Epilepsy Domain-Specific Continual Pre-training. The VideoMAE model is pre-trained on a cross-species dataset (public rodent data and private human data) through a self-supervised reconstruction task. A tube masking strategy is applied, and the model is optimized using the Mean Squared Error ($L_{MSE}$) between the original and reconstructed video clips. Stage 2: Seizure Forecasting. The pre-trained encoder weights are transferred to the forecasting model. Monitoring clips (3-10 s) are processed by the encoder to generate hidden states, which are then fed into a classification head to predict the probability of seizure onset within a future window (e.g., 5 seconds).
  • Figure 2: Impact of mask ratio and pre-training data configurations on few-shot seizure detection performance. The four subplots display the balanced accuracy (bacc) achieved in 2-shot, 3-shot, 4-shot and average scenarios, respectively. In each plot, the x-axis represents the VideoMAE tube masking ratio ranging from 0.1 to 0.9. The different lines correspond to various pre-training data compositions as detailed in the "Configurations" legend, including human patients (+H), different rodent data subsets (+Rodents(Y), +Rodents(N), +Rodents(Y/N)), and the combined cross-species dataset (+Rodents(Y/N)+H).
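
The tube masking and $L_{MSE}$ objective named in the captions can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the token grid size (8 temporal positions by 196 spatial patches), the patch dimension, and the function names are assumptions. Restricting the loss to masked positions follows the original VideoMAE recipe and is assumed here rather than confirmed by the captions.

```python
import numpy as np


def tube_mask(num_frames, num_spatial, mask_ratio, rng):
    """Boolean mask of shape (num_frames, num_spatial); True marks a masked token.

    Tube masking draws one random spatial mask and repeats it at every
    temporal position, so a masked patch stays masked through the whole clip.
    """
    num_masked = int(round(mask_ratio * num_spatial))
    spatial = np.zeros(num_spatial, dtype=bool)
    spatial[rng.choice(num_spatial, size=num_masked, replace=False)] = True
    return np.tile(spatial, (num_frames, 1))


def masked_mse(target, pred, mask):
    """Mean squared error over masked tokens only.

    target, pred: arrays of shape (num_frames, num_spatial, patch_dim).
    mask: boolean array of shape (num_frames, num_spatial).
    """
    sq_err = (pred - target) ** 2
    return float(sq_err[mask].mean())


rng = np.random.default_rng(0)
mask = tube_mask(num_frames=8, num_spatial=196, mask_ratio=0.9, rng=rng)

# Stand-in tensors: a real model would supply the reconstructed patches.
target = np.zeros((8, 196, 4))
pred = np.ones((8, 196, 4))
loss = masked_mse(target, pred, mask)
```

Sweeping `mask_ratio` from 0.1 to 0.9 in such a setup corresponds to the x-axis of Figure 2. Higher ratios leave fewer visible tokens for the encoder, which makes the reconstruction task harder.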