On Learning-Based Traffic Monitoring With a Swarm of Drones

Marko Maljkovic, Nikolas Geroliminis

TL;DR

This work tackles adaptive urban traffic monitoring with a swarm of low-cost drones by formulating a semi-decentralized reinforcement learning framework that trains a single universal Q-function $Q(\mathbf{s},\mathbf{u};\theta)$ to govern per-drone decisions. The method combines a grid-based state representation, a temporal-importance transformation, and an idleness-driven patrolling objective to balance exploration and re-observation of high-importance areas, enabling scalable online adaptation. Empirical results in artificial environments show the RL-based swarm outperforms random, greedy, and sweeping strategies in cumulative patrolling and coverage, and a Shenzhen case study demonstrates successful sim-to-real transfer on real traffic data. The approach offers a robust, scalable solution that remains effective under partial coordination and points to future enhancements with recurrent models and richer low-level control integration.

Abstract

Efficient traffic monitoring is crucial for managing urban transportation networks, especially under congested and dynamically changing traffic conditions. Drones offer a scalable and cost-effective alternative to fixed sensor networks. However, deploying fleets of low-cost drones for traffic monitoring poses challenges in adaptability, scalability, and real-time operation. To address these issues, we propose a learning-based framework for decentralized traffic monitoring with drone swarms, targeting the uneven and unpredictable distribution of monitoring needs across urban areas. Our approach introduces a semi-decentralized reinforcement learning model, which trains a single Q-function using the collective experience of the swarm. This model supports full scalability, flexible deployment, and, when hardware allows, the online adaptation of each drone's action-selection mechanism. We first train and evaluate the model in a synthetic traffic environment, followed by a case study using real traffic data from Shenzhen, China, to validate its performance and demonstrate its potential for real-world applications in complex urban monitoring tasks.
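The idea of training a single Q-function on the swarm's collective experience can be sketched as follows. This is a minimal tabular stand-in, not the authors' model: the action set, the cell-tuple state, the hyperparameters `alpha` and `gamma`, and the class name `SharedQ` are all assumptions made for illustration, and the paper's $Q(\mathbf{s},\mathbf{u};\theta)$ is a parameterized function rather than a lookup table.

```python
import random
from collections import defaultdict

# Hypothetical action set; the page does not list the drones' actual actions.
ACTIONS = ["stay", "north", "south", "east", "west"]


class SharedQ:
    """One Q-table used by every drone (tabular stand-in for Q(s, u; theta))."""

    def __init__(self, alpha=0.1, gamma=0.9):
        self.q = defaultdict(float)   # (state, action) -> value, defaults to 0.0
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self.buffer = []              # experience pooled from all drones

    def act(self, state, epsilon=0.1, rng=random):
        """Epsilon-greedy action for a single drone, using the shared table."""
        if rng.random() < epsilon:
            return rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda u: self.q[(state, u)])

    def store(self, state, action, reward, next_state):
        """Any drone appends its transition to the common buffer."""
        self.buffer.append((state, action, reward, next_state))

    def update(self, batch_size=32, rng=random):
        """Sample a batch from the pooled buffer and apply Q-learning updates."""
        batch = rng.sample(self.buffer, min(batch_size, len(self.buffer)))
        for s, u, r, s2 in batch:
            target = r + self.gamma * max(self.q[(s2, a)] for a in ACTIONS)
            self.q[(s, u)] += self.alpha * (target - self.q[(s, u)])
```

Because every drone reads from and writes to the same object, adding a drone only adds transitions to the buffer; no per-drone parameters need to be created, which is the sense in which a shared Q-function scales with swarm size.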

Paper Structure

This paper contains 12 sections, 26 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Schematic representation of the city of Shenzhen divided into a $12\times 15$ monitoring grid and covered by a swarm consisting of four drones. Nodes of the traffic network representing the city area are color-coded to illustrate the traffic demand at a particular time instant. Drone $d\in\mathcal{D}$, located in the cell $(3,6)$, has the corresponding $(x,y)$ world coordinates given by $\mathbf{p}_d=[z_{3,6}^x,z_{3,6}^y]^T$.
  • Figure 2: Schematic of the online learning framework for a three-drone swarm. When centralized coordination is active, each iteration samples a batch of data from the combined experiences of all drones to update the shared Q-function parameters, which are then used to obtain the drone actions for the current time step.
  • Figure 3: Drone trajectories for various swarm types are shown, with cell colors representing the value $\mathbf{T}^k(\mathbf{p})\mathbf{I}^k(\mathbf{p})$. Brighter cells indicate areas of high unvisited importance, while darker cells correspond to recently visited or less important regions.
  • Figure 4: Performance comparison of different swarm models. For every $k\in\mathbb{Z}_T$, the first plot shows the cumulative patrolling score obtained up to that time step. The second plot shows the evolution of the patrolling score over time, and the third shows the percentage of cells visited since the beginning of the simulation.
  • Figure 5: Temporal evolution of drone trajectories in an RL-based swarm, demonstrated through a case study using real traffic data from the city of Shenzhen.
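
The per-cell quantity $\mathbf{T}^k(\mathbf{p})\mathbf{I}^k(\mathbf{p})$ shown in Figure 3 can be illustrated with a small sketch. The exact definitions of $\mathbf{T}^k$ and $\mathbf{I}^k$ are not reproduced on this page, so the code below assumes a common patrolling convention: idleness grows by one per time step and resets to zero when a drone occupies the cell, while importance is a given non-negative weight per cell. The function names and the $2\times 2$ grid are hypothetical.

```python
def step_idleness(idleness, drone_cells):
    """Advance one time step: every cell ages by 1, visited cells reset to 0.

    Assumed update rule; the paper's temporal transformation may differ.
    """
    visited = set(drone_cells)
    return [
        [0 if (i, j) in visited else t + 1 for j, t in enumerate(row)]
        for i, row in enumerate(idleness)
    ]


def weighted_idleness(idleness, importance):
    """Element-wise product T(p) * I(p): large for important, long-unvisited cells."""
    return [
        [t * w for t, w in zip(t_row, w_row)]
        for t_row, w_row in zip(idleness, importance)
    ]


# Toy 2x2 grid with made-up importance weights.
importance = [[1, 2], [3, 4]]
idleness = [[0, 0], [0, 0]]
idleness = step_idleness(idleness, [(0, 0)])   # a drone sits in cell (0, 0)
score_map = weighted_idleness(idleness, importance)
```

Under these assumptions, a cell that is both important and long unvisited gets a large product, which corresponds to the brighter cells described in the Figure 3 caption.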

Theorems & Definitions (6)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Remark 1
  • Definition 5