Table of Contents
Fetching ...

SmartPathfinder: Pushing the Limits of Heuristic Solutions for Vehicle Routing Problem with Drones Using Reinforcement Learning

Navid Mohammad Imran, Myounggyu Won

TL;DR

SmartPathfinder tackles the NP-hard Vehicle Routing Problem with Drones (VRPD) by decomposing existing VRPD heuristics into universal components and coupling them with a reinforcement-learning (RL) framework. The approach introduces a policy-guided solution modifier, an evaluative reward loop, and an optional solution-escape mechanism to avoid local optima, with a reward function that balances solution quality and computational time. When applied to a state-of-the-art memetic algorithm, the RL-enhanced method achieves up to 28.4% improvement in solution quality and up to 27.3% faster runtimes, particularly on larger instance sizes, and can handle up to 200 customers in reasonable time. The work provides a generalizable guideline for integrating RL with heuristic VRPD methods and demonstrates meaningful practical gains in drone-assisted logistics.

Abstract

The Vehicle Routing Problem with Drones (VRPD) seeks to optimize the routing paths for both trucks and drones, where the trucks are responsible for delivering parcels to customer locations, and the drones are dispatched from these trucks for parcel delivery, subsequently being retrieved by the trucks. Given the NP-Hard complexity of VRPD, numerous heuristic approaches have been introduced. However, improving solution quality and reducing computation time remain significant challenges. In this paper, we conduct a comprehensive examination of heuristic methods designed for solving VRPD, distilling and standardizing them into core elements. We then develop a novel reinforcement learning (RL) framework that is seamlessly integrated with the heuristic solution components, establishing a set of universal principles for incorporating the RL framework with heuristic strategies in an aim to improve both the solution quality and computation speed. This integration has been applied to a state-of-the-art heuristic solution for VRPD, showcasing the substantial benefits of incorporating the RL framework. Our evaluation results demonstrated that the heuristic solution incorporated with our RL framework not only elevated the quality of solutions but also achieved rapid computation speeds, especially when dealing with extensive customer locations.

SmartPathfinder: Pushing the Limits of Heuristic Solutions for Vehicle Routing Problem with Drones Using Reinforcement Learning

TL;DR

SmartPathfinder tackles the NP-hard Vehicle Routing Problem with Drones (VRPD) by decomposing existing VRPD heuristics into universal components and coupling them with a reinforcement-learning (RL) framework. The approach introduces a policy-guided solution modifier, an evaluative reward loop, and an optional solution-escape mechanism to avoid local optima, with a reward function that balances solution quality and computational time. When applied to a state-of-the-art memetic algorithm, the RL-enhanced method achieves up to 28.4% improvement in solution quality and up to 27.3% faster runtimes, particularly on larger instance sizes, and can handle up to 200 customers in reasonable time. The work provides a generalizable guideline for integrating RL with heuristic VRPD methods and demonstrates meaningful practical gains in drone-assisted logistics.

Abstract

The Vehicle Routing Problem with Drones (VRPD) seeks to optimize the routing paths for both trucks and drones, where the trucks are responsible for delivering parcels to customer locations, and the drones are dispatched from these trucks for parcel delivery, subsequently being retrieved by the trucks. Given the NP-Hard complexity of VRPD, numerous heuristic approaches have been introduced. However, improving solution quality and reducing computation time remain significant challenges. In this paper, we conduct a comprehensive examination of heuristic methods designed for solving VRPD, distilling and standardizing them into core elements. We then develop a novel reinforcement learning (RL) framework that is seamlessly integrated with the heuristic solution components, establishing a set of universal principles for incorporating the RL framework with heuristic strategies in an aim to improve both the solution quality and computation speed. This integration has been applied to a state-of-the-art heuristic solution for VRPD, showcasing the substantial benefits of incorporating the RL framework. Our evaluation results demonstrated that the heuristic solution incorporated with our RL framework not only elevated the quality of solutions but also achieved rapid computation speeds, especially when dealing with extensive customer locations.
Paper Structure (16 sections, 2 equations, 8 figures, 1 table)

This paper contains 16 sections, 2 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: An overview of a truck-drone operation in the vehicle routing problem with drones (VRPD). Trucks and drones collaborate to distribute parcels to customers, with the docking hubs serving as the retrieval points for trucks to collect returning drones.
  • Figure 2: An architecture of SmartPathfinder, illustrating integration with the RL framework with a heuristic algorithm for VRPD.
  • Figure 3: The reward values of the memetic algorithm-based heuristic solution integrated with the RL framework, illustrating the convergence of the reward value.
  • Figure 4: An example solution generated with (a) RL+MA, (b) MA, and (c) NS. The integration of the RL framework with MA results in more efficient paths for both trucks and drones compared with MA and NS.
  • Figure 5: The solution quality with varying numbers of customers. RL+MA significantly improves the solution quality especially for large-scale problems.
  • ...and 3 more figures