HyMAD: A Hybrid Multi-Activity Detection Approach for Border Surveillance and Monitoring

Sriram Srinivasan, Srinivasan Aruchamy, Siva Ram Krisha Vadali

TL;DR

HyMAD tackles multi-label seismic event detection when human, animal, and vehicle activities overlap in border surveillance. It introduces a hybrid architecture that fuses learnable frequency features from SincNet with RNN-based temporal encoding, applying self-attention within each modality and cross-attention fusion to disentangle concurrent events. The approach achieves competitive performance on real-world field recordings and generalizes to complex overlaps, while offering a modular framework for extension. This work advances seismic signal analysis toward robust, real-time monitoring in security applications.

Abstract

Seismic sensing has emerged as a promising solution for border surveillance and monitoring: seismic sensors are small and often buried underground, making them difficult for intruders to detect, avoid, or vandalize. This significantly enhances their effectiveness compared to highly visible cameras or fences. However, accurately detecting and distinguishing between simultaneous, overlapping activities, such as human intrusions, animal movements, and vehicle rumbling, remains a major challenge due to the complex and noisy nature of seismic signals. Correctly identifying simultaneous activities is critical because failing to separate them can lead to misclassification, missed detections, and an incomplete understanding of the situation, thereby reducing the reliability of surveillance systems. To tackle this problem, we propose HyMAD (Hybrid Multi-Activity Detection), a deep neural architecture based on spatio-temporal feature fusion. The framework integrates spectral features extracted with SincNet and temporal dependencies modeled by a recurrent neural network (RNN). In addition, HyMAD employs self-attention layers to strengthen intra-modal representations and a cross-modal fusion module to achieve robust multi-label classification of seismic events. We evaluate our approach on a dataset constructed from real-world field recordings collected in the context of border surveillance and monitoring, demonstrating its ability to generalize to complex, simultaneous activity scenarios involving humans, animals, and vehicles. Our method achieves competitive performance and offers a modular framework for extending seismic-based activity recognition in real-world security applications.
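
To make the multi-label setting concrete, the following is a minimal sketch of how a classifier head can flag several activities in the same window. It is not the authors' implementation: the label order, the 0.5 threshold, the use of independent sigmoids, and the binary cross-entropy loss are assumptions chosen because they are the common default for multi-label outputs, and the function names are hypothetical.

```python
import math

# Hypothetical label order, taken from the activity types named in the abstract.
LABELS = ("human", "animal", "vehicle")


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_multilabel(logits, threshold=0.5):
    """Return every label whose independent probability reaches the threshold.

    Each class gets its own sigmoid, so several labels (or none) can be
    active for the same window. The threshold value is an assumption.
    """
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return [name for name, v in zip(LABELS, logits) if sigmoid(v) >= threshold]


def bce_with_logits(logits, targets):
    """Mean per-label binary cross-entropy computed directly from logits."""
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total = 0.0
    for x, t in zip(logits, targets):
        # Stable form of -[t*log(s) + (1-t)*log(1-s)] with s = sigmoid(x).
        total += max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# A window where a person and a vehicle are both present.
active = decode_multilabel([2.0, -3.0, 1.5])
loss = bce_with_logits([2.0, -3.0, 1.5], [1.0, 0.0, 1.0])
```

Because each label is scored independently, overlapping events do not compete for probability mass as they would under a single softmax, which is why this formulation suits concurrent activities.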

Paper Structure

This paper contains 29 sections, 14 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Overall architecture of the proposed Hybrid Multi-Activity Detection (HyMAD) model. The raw seismic signal is processed through parallel frequency and temporal feature extraction paths, which are then integrated via a cross-attention mechanism before classification.
  • Figure 2: Geophone used in the experimental setup.
  • Figure 3: Data acquisition system used in the experimental setup.
  • Figure 4: t-SNE visualization of feature embeddings.
  • Figure 5: Comparison of ROC (left) and PR (right) curves for the proposed HyMAD model.
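
The cross-attention step named in the Figure 1 caption can be illustrated with a small sketch of scaled dot-product attention, in which features from one path act as queries over features from the other. This is an assumption-laden toy, not the paper's module: which path supplies the queries, the absence of learned projections and multiple heads, and the feature shapes are all choices made here for illustration.

```python
import math


def softmax(scores):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows.

    queries: list of vectors from one path (assumed here: frequency features)
    keys, values: lists of vectors from the other path (assumed: temporal)
    Returns one fused vector per query, a weighted average of the values.
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return fused


# Toy inputs: 2 frequency-path frames attend over 3 temporal-path frames.
freq_feats = [[1.0, 0.0], [0.0, 1.0]]
temp_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(freq_feats, temp_feats, temp_feats)
```

Each fused row is a convex combination of the temporal-path vectors, weighted by how well they match the corresponding frequency-path query. In a trained model the queries, keys, and values would normally pass through learned linear projections first.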