A Novel MDP Decomposition Framework for Scalable UAV Mission Planning in Complex and Uncertain Environments

Md Muzakkir Quamar, Ali Nasir, Sami ELFerik

TL;DR

The paper tackles scalable UAV mission planning under uncertainty by introducing a two-stage MDP decomposition: first, a factor-based partitioning into goal-specific sub-MDPs, and second, a priority-based recombination with a meta-policy. It proves policy equivalence under mild independence assumptions and demonstrates substantial computational gains, validating the approach with simulations that show near-perfect fidelity to the global policy and large speedups. Key contributions include a fault- and energy-aware MDP formulation, novel decomposition algorithms, and rigorous mapping between sub-MDP and global policies. The work enables real-time, fault-tolerant UAV decision-making in complex environments and provides a foundation for extending scalable decision-making to broader AI planning problems.

Abstract

This paper presents a scalable and fault-tolerant framework for unmanned aerial vehicle (UAV) mission management in complex and uncertain environments. The proposed approach addresses the computational bottleneck inherent in solving large-scale Markov Decision Processes (MDPs) by introducing a two-stage decomposition strategy. In the first stage, a factor-based algorithm partitions the global MDP into smaller, goal-specific sub-MDPs by leveraging domain-specific features such as goal priority, fault states, spatial layout, and energy constraints. In the second stage, a priority-based recombination algorithm solves each sub-MDP independently and integrates the results into a unified global policy using a meta-policy for conflict resolution. Importantly, we present a theoretical analysis showing that, under mild probabilistic independence assumptions, the combined policy is provably equivalent to the optimal global MDP policy. Our work advances artificial intelligence (AI) decision scalability by decomposing large MDPs into tractable subproblems with provable global equivalence. The proposed decomposition framework enhances the scalability of Markov Decision Processes, a cornerstone of sequential decision-making in artificial intelligence, enabling real-time policy updates for complex mission environments. Extensive simulations validate the effectiveness of our method, demonstrating orders-of-magnitude reduction in computation time without sacrificing mission reliability or policy optimality. The proposed framework establishes a practical and robust foundation for scalable decision-making in real-time UAV mission execution.
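The two-stage idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the chain-shaped sub-MDPs, the goal names `deliver` and `survey`, and the helpers `make_chain`, `value_iteration`, and `meta_policy` are all hypothetical, and the fault and energy factors mentioned in the abstract are not modeled. Each sub-MDP is solved independently with cost-minimizing value iteration, and a simple priority rule then selects which sub-policy acts in a given joint state.

```python
from typing import Dict, List, Optional, Tuple

# transitions[s][a] is a list of (probability, next_state) pairs.
Transitions = Dict[int, Dict[str, List[Tuple[float, int]]]]
Costs = Dict[int, Dict[str, float]]


def make_chain(n_states: int, p_success: float, step_cost: float):
    """Toy sub-MDP (an assumption): a chain whose last state is an absorbing goal."""
    goal = n_states - 1
    transitions: Transitions = {}
    costs: Costs = {}
    for s in range(n_states):
        if s == goal:
            transitions[s] = {"hold": [(1.0, s)]}
            costs[s] = {"hold": 0.0}
        else:
            transitions[s] = {
                "advance": [(p_success, s + 1), (1.0 - p_success, s)],
                "wait": [(1.0, s)],
            }
            costs[s] = {"advance": step_cost, "wait": step_cost}
    return transitions, costs, goal


def q_value(s, a, transitions, costs, values, gamma):
    """Expected discounted cost of taking action a in state s."""
    return costs[s][a] + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])


def value_iteration(transitions, costs, gamma=0.95, tol=1e-8):
    """Cost-minimizing value iteration; returns (values, greedy policy)."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s in transitions:
            best = min(q_value(s, a, transitions, costs, values, gamma)
                       for a in transitions[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: min(transitions[s],
               key=lambda a: q_value(s, a, transitions, costs, values, gamma))
        for s in transitions
    }
    return values, policy


def meta_policy(global_state: Dict[str, int], sub_policies, goals,
                priority: List[str]) -> Tuple[Optional[str], str]:
    """Pick the action of the highest-priority sub-MDP whose goal is unfinished."""
    for name in priority:
        if global_state[name] != goals[name]:
            return name, sub_policies[name][global_state[name]]
    return None, "hold"


# Stage 1: goal-specific sub-MDPs (4 + 3 = 7 states instead of 4 * 3 = 12 joint states).
sub_mdps = {"deliver": make_chain(4, 0.9, 1.0), "survey": make_chain(3, 0.8, 1.0)}

# Stage 2: solve each sub-MDP independently, then recombine by priority.
sub_policies, goals = {}, {}
for name, (transitions, costs, goal) in sub_mdps.items():
    _, sub_policies[name] = value_iteration(transitions, costs)
    goals[name] = goal

priority = ["deliver", "survey"]
choice = meta_policy({"deliver": 0, "survey": 0}, sub_policies, goals, priority)
```

In this toy setting the saving is small, but the gap between the summed sub-MDP state counts and the product of the joint state space widens quickly as more goals are added, which is the scalability argument the abstract makes.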

Paper Structure

This paper contains 27 sections, 2 theorems, 28 equations, 12 figures, 5 tables, 2 algorithms.

Key Result

Theorem 1

(Policy Equivalence under Probabilistic Independence) Consider a global MDP with state space $\mathcal{S}$, action space $\mathcal{A}$, transition kernel $P$, cost function $J$, and discount factor $\gamma$. Assume the following conditions hold: [...] Under these conditions, solving each sub-MDP individually via value or policy iteration yields locally optimal policies $\pi_i^*(s_i)$. Furthermo

Figures (12)

  • Figure 1: Block diagram architecture of the proposed MDP model for single UAV operation.
  • Figure 2: Basic components of an MDP model formulation.
  • Figure 3: Value-iteration convergence time vs. state–space size (log–log), showing measured points and power–law extrapolation.
  • Figure 4: Simplified block representation of decomposed sub-MDP policies combined into a unified global policy.
  • Figure 5: Value-iteration convergence plot.
  • ...and 7 more figures

Theorems & Definitions (3)

  • Theorem 1
  • proof
  • Corollary 2