Deep Reinforcement Learning for Drone Route Optimization in Post-Disaster Road Assessment
Huatian Gong, Jiuh-Biing Sheu, Zheng Wang, Xiaoguang Yang, Ran Yan
TL;DR
The paper tackles the need for rapid drone-based road damage assessment after disasters by introducing AEDM, an attention-based encoder-decoder trained with multi-task reinforcement learning (POMO) to output high-quality drone routes within seconds. It combines three components: a simple network transformation that converts link-based routing into a node-based formulation, synthetic road-network generation to overcome data scarcity, and a reward-normalized multi-task training regimen to handle varying drone counts and time limits. Empirical results show that AEDM outperforms commercial solvers by 20–71% and traditional heuristics by 23–35% in solution quality while keeping inference to 1–2 seconds, and that it generalizes robustly to unseen problem scales and real-world networks (e.g., Anaheim). The work reduces reliance on domain-specific algorithm design, which speeds deployment in time-critical humanitarian response and provides a framework that can be extended to more complex constraints in future disaster contexts.
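To make the reward-normalized POMO training signal concrete, the sketch below computes advantages for one problem instance from several rollouts, using the mean rollout reward as the shared baseline (the standard POMO idea). The division by the rollout standard deviation is an assumption made here for illustration: it is one common way to keep rewards from tasks with different drone counts and time limits on a comparable scale, and it is not necessarily the normalization the authors use. The function name `pomo_advantages` is hypothetical.

```python
from statistics import mean, pstdev


def pomo_advantages(rewards, eps=1e-8):
    """Advantages for the rollouts of ONE instance (hypothetical helper).

    rewards: total reward of each rollout sampled for the same instance.
    The shared baseline is the mean over rollouts (POMO-style); dividing by
    the standard deviation is an assumed normalization, not the paper's.
    """
    baseline = mean(rewards)
    scale = pstdev(rewards) + eps  # eps avoids division by zero
    return [(r - baseline) / scale for r in rewards]


# Two tasks whose raw rewards differ by a factor of 10 yield
# (almost) the same normalized advantages.
small_task = pomo_advantages([1.0, 2.0, 3.0])
large_task = pomo_advantages([10.0, 20.0, 30.0])
```

Under this assumption, a task with large raw rewards does not dominate the gradient simply because of its scale; only the relative ranking of rollouts within each instance matters.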
Abstract
Rapid post-disaster road damage assessment is critical for effective emergency response, yet traditional optimization methods suffer from excessive computational time and require domain knowledge for algorithm design, making them unsuitable for time-sensitive disaster scenarios. This study proposes an attention-based encoder-decoder model (AEDM) for rapid drone routing decisions in post-disaster road damage assessment. The method employs deep reinforcement learning to determine high-quality drone assessment routes without requiring algorithmic design knowledge. A network transformation method is developed to convert link-based routing problems into equivalent node-based formulations, while a synthetic road network generation technique addresses the scarcity of large-scale training datasets. The model is trained using policy optimization with multiple optima (POMO), extended with multi-task learning to handle diverse parameter combinations. Experimental results demonstrate two key strengths of AEDM: it outperforms commercial solvers by 20–71% and traditional heuristics by 23–35% in solution quality, and it achieves rapid inference (1–2 seconds, versus 100–2,000 seconds for traditional methods). The model exhibits strong generalization across varying problem scales, drone numbers, and time constraints, consistently outperforming baseline methods on unseen parameter distributions and real-world road networks. The proposed method balances computational efficiency with solution quality, making it particularly suitable for time-critical disaster response applications where rapid decision-making is essential for saving lives. The source code for AEDM is publicly available at https://github.com/PJ-HTU/AEDM-for-Post-disaster-road-assessment.
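The abstract does not spell out the network transformation, so the following is only a minimal sketch of one plausible reading: each road link is split by an artificial node, so that "assess this link" becomes "visit this node" and a node-based routing model can be applied. The function name `links_to_nodes`, the `link_<i>` naming scheme, and the choice to split each link at its midpoint are assumptions for illustration and may differ from the authors' construction.

```python
from collections import defaultdict


def links_to_nodes(links):
    """Turn a link-based network into a node-based one (illustrative only).

    links: list of (u, v, length) undirected road links.
    Returns (link_nodes, adjacency), where link_nodes[i] is the artificial
    node standing in for links[i] and adjacency[a][b] is the travel length.
    """
    adjacency = defaultdict(dict)
    link_nodes = []
    for idx, (u, v, length) in enumerate(links):
        mid = f"link_{idx}"        # hypothetical name for the artificial node
        link_nodes.append(mid)
        half = length / 2.0        # assumed: node placed at the link midpoint
        for a, b in ((u, mid), (mid, v)):
            adjacency[a][b] = half
            adjacency[b][a] = half
    return link_nodes, dict(adjacency)


# Two links sharing junction B: A --4-- B --6-- C
links = [("A", "B", 4.0), ("B", "C", 6.0)]
link_nodes, adjacency = links_to_nodes(links)
```

In this sketch the set of nodes a drone must visit is exactly `link_nodes`, while the original junctions remain available as transit points; total path lengths between junctions are unchanged because each link is replaced by two half-length segments.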
