Hierarchical Decision-Making under Uncertainty: A Hybrid MDP and Chance-Constrained MPC Approach

Siyuan Li, Chengyuan Liu, Wen-Hua Chen

Abstract

This paper presents a hierarchical decision-making framework for autonomous systems operating under uncertainty, demonstrated through autonomous driving as a representative application. Surrounding agents are modeled using Hybrid Markov Decision Processes (HMDPs) that jointly capture maneuver-level and dynamic-level uncertainties, enabling multi-modal environmental prediction. The ego agent is modeled using a separate HMDP and integrated into a Model Predictive Control (MPC) framework that unifies maneuver selection with dynamic feasibility within a single optimization. A set of joint chance constraints bridges environmental prediction and optimization: it incorporates the multi-modal environment predictions into the MPC formulation and ensures safety across all plausible interaction scenarios. The proposed framework provides theoretical guarantees on recursive feasibility and asymptotic stability. Its benefits in terms of safety and efficiency are validated through comprehensive evaluations in highway and urban environments, together with comparisons against a rule-based baseline.

Paper Structure

This paper contains 29 sections, 2 theorems, 66 equations, 11 figures, 8 tables.

Key Result

Theorem 1

Suppose Assumption baseline holds. If at the initial time $k = 0$, the MPC optimization (eq:etermpc) can find an action sequence that satisfies all safety and feasibility requirements within the prediction horizon, then for all time steps $k \geq 1$, it can still find an action sequence that satisfi

Figures (11)

  • Figure 1: Schematic overview of the proposed hierarchical decision-making framework.
  • Figure 2: Illustration of multi-modal prediction and reachability-set construction for an SV. The low-probability branch is pruned, while the remaining hypotheses are propagated to obtain predicted means, covariances, and ellipsoidal reachable regions whose union approximates the reachability set $\mathcal{X}^{sa}$ used in the joint chance constraint (\ref{eq:mixchanceconstraints}).
  • Figure 3: Snapshots of the EV (red) at six representative time instants. Each snapshot shows the vehicle state at $t_{k+1}$ resulting from the decision computed at the sampling instant $t_k$ marked in Fig. \ref{fig:case1_actiontime}. Subfigures (a)–(f) correspond to 12.2, 21.0, 23.4, 29.8, 30.6, and 41.0 s, respectively.
  • Figure 4: Time histories of the EV’s lateral (top) and longitudinal (bottom) MDP states and actions in Case 1. Yellow circles indicate the sampling instants at which the six decisions are taken, while Fig. \ref{fig:case1_snapshots} shows the resulting states at the next sampling instants.
  • Figure 5: Complete EV trajectory in Case 1.
  • ...and 6 more figures
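
The prediction step summarized in the Figure 2 caption (prune unlikely maneuvers, propagate the rest, take the union of ellipsoidal regions) can be illustrated with a minimal sketch. This is not the paper's implementation. The linear-Gaussian position model, the pruning threshold, the confidence level, and all function names below are assumptions chosen for illustration only.

```python
import math
import numpy as np


def prune_hypotheses(probs, threshold):
    """Drop maneuver hypotheses below `threshold` and renormalize the rest.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    kept = {m: p for m, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {m: p / total for m, p in kept.items()}


def propagate(mean, cov, A, B, u, Q, steps):
    """Propagate a Gaussian through the assumed model x+ = A x + B u + w, w ~ N(0, Q).

    Returns the list of (mean, covariance) pairs for each predicted step.
    """
    out = []
    for _ in range(steps):
        mean = A @ mean + B @ u
        cov = A @ cov @ A.T + Q
        out.append((mean, cov))
    return out


def in_ellipsoid(point, mean, cov, confidence):
    """Check whether a 2-D point lies in the `confidence` ellipsoid of N(mean, cov)."""
    d = point - mean
    # Chi-square quantile with 2 degrees of freedom has the closed form -2 ln(1 - p).
    r2 = -2.0 * math.log(1.0 - confidence)
    return float(d @ np.linalg.solve(cov, d)) <= r2


def in_reachable_union(point, branches, confidence):
    """Check membership in the union of per-hypothesis ellipsoids at one time step."""
    return any(in_ellipsoid(point, m, c, confidence) for m, c in branches)


# Illustrative usage with made-up numbers (2-D position state, velocity input).
dt = 0.5
A, B = np.eye(2), dt * np.eye(2)
Q = 0.05 * np.eye(2)
maneuver_inputs = {
    "keep": np.array([20.0, 0.0]),
    "left": np.array([20.0, 1.0]),
    "right": np.array([20.0, -1.0]),
}
weights = prune_hypotheses({"keep": 0.68, "left": 0.30, "right": 0.02}, threshold=0.05)
final_step = [
    propagate(np.zeros(2), 0.1 * np.eye(2), A, B, maneuver_inputs[m], Q, steps=4)[-1]
    for m in weights
]
```

Here `final_step` holds one (mean, covariance) pair per surviving hypothesis at the end of the horizon, and `in_reachable_union` tests a candidate ego position against their union. A real chance-constrained MPC would impose such a test as a constraint at every step of the horizon rather than checking a single point.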

Theorems & Definitions (16)

  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4
  • Remark 5
  • Remark 6
  • Remark 7
  • Theorem 1
  • proof
  • Theorem 2
  • ...and 6 more