On Learning-Based Traffic Monitoring With a Swarm of Drones
Marko Maljkovic, Nikolas Geroliminis
TL;DR
This work tackles adaptive urban traffic monitoring with a swarm of low-cost drones by formulating a semi-decentralized reinforcement learning framework that trains a single universal Q-function $Q(\mathbf{s},\mathbf{u};\theta)$ to govern per-drone decisions. The method combines a grid-based state representation, a temporal-importance transformation, and an idleness-driven patrolling objective to balance exploration against re-observation of high-importance areas, enabling scalable online adaptation. Empirical results in synthetic environments show that the RL-based swarm outperforms random, greedy, and sweeping strategies in cumulative patrolling and coverage, and a Shenzhen case study demonstrates successful sim-to-real transfer on real traffic data. The approach remains effective under partial coordination and points to future enhancements with recurrent models and richer low-level control integration.
Abstract
Efficient traffic monitoring is crucial for managing urban transportation networks, especially under congested and dynamically changing traffic conditions. Drones offer a scalable and cost-effective alternative to fixed sensor networks. However, deploying fleets of low-cost drones for traffic monitoring poses challenges in adaptability, scalability, and real-time operation. To address these issues, we propose a learning-based framework for decentralized traffic monitoring with drone swarms, targeting the uneven and unpredictable distribution of monitoring needs across urban areas. Our approach introduces a semi-decentralized reinforcement learning model, which trains a single Q-function using the collective experience of the swarm. This model supports full scalability, flexible deployment, and, when hardware allows, the online adaptation of each drone's action-selection mechanism. We first train and evaluate the model in a synthetic traffic environment, followed by a case study using real traffic data from Shenzhen, China, to validate its performance and demonstrate its potential for real-world applications in complex urban monitoring tasks.
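To make the idea of a single Q-function trained on the swarm's collective experience more concrete, the following is a minimal sketch and not the authors' implementation. It makes several simplifying assumptions: the parametric $Q(\mathbf{s},\mathbf{u};\theta)$ is replaced by a lookup table, each drone's state is only its own grid cell, the reward is the idleness (steps since last observation) of the cell it moves to, and all names (`step_idleness`, `select_action`, `q_update`, `run`) are hypothetical.

```python
import random

# Hypothetical action set: each action is a (row, column) displacement on the grid.
ACTIONS = {"stay": (0, 0), "up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step_idleness(idleness, observed):
    """Age every cell by one step, then reset the cells observed by a drone."""
    return [[0 if (r, c) in observed else v + 1 for c, v in enumerate(row)]
            for r, row in enumerate(idleness)]


def move(pos, action, rows, cols):
    """Apply an action and clamp the result to the grid boundaries."""
    dr, dc = ACTIONS[action]
    return (min(max(pos[0] + dr, 0), rows - 1), min(max(pos[1] + dc, 0), cols - 1))


def select_action(q, state, epsilon, rng):
    """Epsilon-greedy choice; every drone queries the same shared table q."""
    if rng.random() < epsilon:
        return rng.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update applied to the shared table."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def run(rows=4, cols=4, n_drones=2, steps=200, epsilon=0.2, seed=0):
    """Simulate a small swarm whose drones all read and write one Q-table."""
    rng = random.Random(seed)
    q = {}  # single Q-table shared by the whole swarm
    idleness = [[0] * cols for _ in range(rows)]
    drones = [(rng.randrange(rows), rng.randrange(cols)) for _ in range(n_drones)]
    for _ in range(steps):
        new_positions = []
        for pos in drones:
            action = select_action(q, pos, epsilon, rng)
            nxt = move(pos, action, rows, cols)
            reward = idleness[nxt[0]][nxt[1]]  # visiting stale cells pays more
            q_update(q, pos, action, reward, nxt)  # every drone updates the same q
            new_positions.append(nxt)
        drones = new_positions
        idleness = step_idleness(idleness, set(drones))
    return q, idleness, drones


if __name__ == "__main__":
    q, idleness, drones = run()
    print("final drone cells:", drones)
    print("largest idleness on the grid:", max(max(row) for row in idleness))
```

Because the table `q` is shared, adding a drone only adds another source of transitions rather than another model to train, which is the scalability property the abstract refers to. A faithful version would replace the table with a function approximator and use the richer grid-based state described in the paper.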
