Optimal Path Planning and Cost Minimization for a Drone Delivery System Via Model Predictive Control

Muhammad Al-Zafar Khan, Jamal Al-Karaki

TL;DR

This work addresses efficient, constraint-aware drone delivery by formulating a multi-agent delivery problem as a Model Predictive Control (MPC) task and benchmarking it against three MARL baselines (IQL, JAL, VDN). The authors develop a receding-horizon MPC framework with horizon length $N$ that minimizes $J = \sum_{i=1}^{n}\sum_{j=1}^{M} c_j \cdot \mathbbm{1}_{ij} + \lambda \cdot n$ under discrete-time dynamics $\mathbf{x}_i(k+1)=\mathbf{A}^{T}\mathbf{x}_i(k)+\mathbf{B}^{T}\mathbf{u}_i(k)$, while enforcing per-building delivery and no-fly airspace constraints. Across two grid-world environments of increasing complexity, MPC consistently converges faster and needs fewer drones to reach optimality, whereas the MARL approaches tend to achieve lower per-building costs at the expense of more drones and longer training times. The results highlight MPC’s suitability for real-time, scalable drone delivery under constraints and lay the groundwork for integrating advanced MARL techniques in future benchmarking studies.
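To make the two formulas above concrete, the following is a minimal sketch in plain Python that evaluates the cost $J$ for a given drone-to-building assignment and advances one drone through the linear dynamics. It is not the authors' implementation and does not include the receding-horizon optimizer or the no-fly constraints. The function names, the reading of $\mathbbm{1}_{ij}$ as "drone $i$ delivers to building $j$", and all numerical values are assumptions chosen for illustration.

```python
def mat_T_vec(M, v):
    """Return M^T v for a matrix M given as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [sum(M[r][c] * v[r] for r in range(rows)) for c in range(cols)]


def step(x, u, A, B):
    """One step of x(k+1) = A^T x(k) + B^T u(k)."""
    return [a + b for a, b in zip(mat_T_vec(A, x), mat_T_vec(B, u))]


def delivery_cost(assignment, c, lam):
    """J = sum_i sum_j c_j * 1_ij + lam * n.

    assignment[i][j] is 1 if drone i delivers to building j, else 0
    (an assumed interpretation of the indicator); n is the number of drones.
    """
    n = len(assignment)
    delivery = sum(c[j] * row[j] for row in assignment for j in range(len(c)))
    return delivery + lam * n


# Hypothetical example: identity A and B, so the input is a displacement.
A = [[1, 0], [0, 1]]
B = [[1, 0], [0, 1]]
x = [0, 0]
for u in ([1, 0], [0, 1]):  # move one cell along each axis in turn
    x = step(x, u, A, B)

# Two drones, three buildings, made-up per-building costs and drone penalty.
assignment = [[1, 0, 1],
              [0, 1, 0]]
c = [2.0, 3.0, 1.5]
J = delivery_cost(assignment, c, lam=4.0)
print(x, J)
```

The $\lambda \cdot n$ term is what makes the number of drones part of the trade-off: adding a drone is only worthwhile if it reduces the delivery term by more than $\lambda$.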

Abstract

In this study, we formulate the drone delivery problem as a control problem and solve it using Model Predictive Control (MPC). Two experiments are performed: the first on a less challenging, lower-dimensional grid-world environment, and the second on a higher-dimensional environment with added complexity. The MPC method was benchmarked against three popular Multi-Agent Reinforcement Learning (MARL) algorithms: Independent $Q$-Learning (IQL), Joint Action Learners (JAL), and Value-Decomposition Networks (VDN). The MPC method solved the problem faster and required fewer drones to achieve a minimized cost and navigate the optimal path.

Paper Structure

This paper contains 7 sections, 6 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: A depiction of a drone delivery system navigating through a grid world environment to deliver packages.
  • Figure 2: Left: Cost function indicating the total cost and delivery cost. Right: An optimal path that traverses all delivery locations.
  • Figure 3: Optimal paths and cost functions for the various MARL algorithms applied to environment 1.
  • Figure 4: Left: Cost function indicating the total cost and delivery cost. Right: A geodesic optimal path traversing all delivery locations.
  • Figure 5: Optimal paths and cost functions for various MARL algorithms applied to environment 2.