
Distributed Acoustic Sensing for Urban Traffic Monitoring: Spatio-Temporal Attention in Recurrent Neural Networks

Izhan Fakhruzi, Manuel Titos, Carmen Benítez, Luz García

Abstract

Effective urban traffic monitoring is essential for improving mobility, enhancing safety, and supporting sustainable cities. Distributed Acoustic Sensing (DAS) enables large-scale traffic observation by transforming existing fiber-optic infrastructure into dense arrays of vibration sensors. However, modeling the high-resolution spatio-temporal structure of DAS data for reliable traffic event recognition remains challenging. This study presents a real-world DAS-based traffic monitoring experiment conducted in Granada, Spain, where vehicles cross a fiber deployed perpendicular to the roadway. Recurrent neural networks (RNNs) are employed to model intra- and inter-event temporal dependencies. Spatial and temporal attention mechanisms are systematically integrated within the RNN architecture to analyze their impact on recognition performance, parameter efficiency, and interpretability. Results show that an appropriate and complementary placement of attention modules improves the balance between accuracy and model complexity. Attention heatmaps provide physically meaningful interpretations of classification decisions by highlighting informative spatial locations and temporal segments. Furthermore, the proposed SA-bi-TA configuration demonstrates spatial transferability, successfully recognizing traffic events at sensing locations different from those used during training, with only moderate performance degradation. These findings support the development of scalable and interpretable DAS-based traffic monitoring systems capable of operating under heterogeneous urban sensing conditions.

Paper Structure

This paper contains 30 sections, 2 equations, 8 figures, 8 tables.

Figures (8)

  • Figure 1: General attention module (adapted from Brauwers2023).
  • Figure 2: 2.5 km of fiber. The red triangles mark the fiber locations where the data used in this study were recorded: (1) Palacio de Congresos; (2) Acera del Darro crossroads. (Courtesy of Google Maps)
  • Figure 3: (Left) Schematic overview of the fiber laid perpendicular to the lanes. Identified SPs on the fiber are denoted by #1, #2, ..., #n. (Right) DAS records of 10-minute-long signals, with the y-axis corresponding to the SPs. The fiber layout is depicted for: (a) Palacio de Congresos with 2 lanes, 1 direction, and 3 identified SPs; (b) Acera del Darro with 4 lanes, 2 directions, and 7 identified SPs, divided into 3 groups of SPs: A (#1--#3), B (#3--#5), C (#5--#7).
  • Figure 4: Model comparison in terms of accuracy and F1-score, with standard deviation (whiskers), across 5-fold cross-validation. The left-side y-axis is limited to the 50%--90% range. Models are sorted from left to right by increasing number of trainable parameters (secondary y-axis) and are trained with and without temporal derivatives (+$\Delta$). (a) Baseline models (blue), (b) Single attention models (green), and (c) Cascade spatio-temporal attention models (red).
  • Figure 5: Two representative architectures from the ablation study: (a) SA-bi-TA, (b) bi-TA-SA.
  • ...and 3 more figures