Influence of Video Dynamics on EEG-based Single-Trial Video Target Surveillance System

Heon-Gyu Kwak, Sung-Jin Kim, Hyeon-Taek Han, Ji-Hoon Jeong, Seong-Whan Lee

TL;DR

The paper addresses the challenge of limited hostile-target data in video surveillance by proposing an EEG-based single-trial target detection framework to complement computer-vision systems. It introduces a hierarchical DeepConvNet architecture with three-class outputs (non-target, true-target, error-target), trained with data augmentation and subject-specific calibration. In online asynchronous experiments, it achieves a mean macro F-beta of 0.6522 on Video1, while performance drops for videos with dynamic camera movement and weather, suggesting reliance on passive visual features driven by stimulus dynamics. ERP analysis shows no strong discriminative temporal patterns, and saliency maps point to central/occipital channels, indicating the model leverages visual-perception cues; the study highlights the need for careful stimulus design to realize robust EEG-based surveillance augmentation.

Abstract

Target detection models are among the most widely used deep learning-based applications for reducing human effort in video surveillance and patrol. However, conventional computer vision-based target detection models can show limited performance in military use because of the lack of sample data of hostile targets. In this paper, we explore the feasibility of an electroencephalography-based video target detection model that could serve as a supportive module of a military video surveillance system. The proposed framework and detection model showed promising performance, achieving a mean macro F-beta of 0.6522 with asynchronous real-time data from five subjects on one video stimulus, but not on some other video stimuli. By analyzing the experimental results for each video stimulus, we present the factors that would affect the performance of electroencephalography-based video target detection models.
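
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro F-beta score can be computed for the three classes named in the TL;DR (non-target, true-target, error-target). It is an illustration, not the authors' evaluation code: the function name `macro_f_beta`, the label strings, the toy predictions, and the default `beta=1.0` are all assumptions, since this section does not state which beta the paper uses.

```python
from typing import Hashable, Sequence


def macro_f_beta(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    labels: Sequence[Hashable],
    beta: float = 1.0,  # assumed default; the paper's beta is not given here
) -> float:
    """Unweighted mean of the per-class F-beta scores."""
    b2 = beta * beta
    scores = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # F_beta = (1 + b^2) * TP / ((1 + b^2) * TP + b^2 * FN + FP)
        denom = (1 + b2) * tp + b2 * fn + fp
        # A class with no true or predicted samples contributes 0 by convention.
        scores.append((1 + b2) * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels and predictions, for illustration only.
LABELS = ["non-target", "true-target", "error-target"]
y_true = ["non-target", "non-target", "non-target",
          "true-target", "true-target", "error-target"]
y_pred = ["non-target", "non-target", "true-target",
          "true-target", "error-target", "error-target"]

score = macro_f_beta(y_true, y_pred, LABELS)
```

Because the macro average weights every class equally, a model that ignores the rare target classes scores poorly even when non-target trials dominate the recording, which is one reason the metric suits imbalanced detection tasks.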

Paper Structure (18 sections, 2 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: An overview of the EEG data acquisition process. A subject first watches 'Video1' twice, and then watches 'Video2-N' and 'Video2-AI' twice in turn. Each video clip is eight minutes long. The subject rests for two minutes after each session ends. Each subject completes four sessions of EEG acquisition, taking 52 minutes in total.
  • Figure 2: An example of an EEG data acquisition session with surveillance video clips. Subjects were instructed to concentrate on the presented video clip to detect appearing targets. EEG signals were recorded using 32-channel electrodes in a soundproof experimental booth with the lights off.
  • Figure 3: Grand average of ERPs for subjects 10, 11, 12, and 13 by class, using EEG data from 'Video2-N'. The figure shows the ERPs for the three seconds following the target appearance. The red vertical line indicates the time point of the target appearance. The black plot shows the ERPs for non-targets, the red plot for true-targets (enemy soldier), the blue plot for error-targets, and the green plot for camera rotations.
  • Figure 4: Channel-wise saliency maps of subjects 10, 11, 12, and 13, using EEG data from 'Video2-AI'. Deep blue areas indicate that the EEG channels located there are relatively more important than other channels in model inference. Red areas indicate the opposite.