An Extended Horizon Tactical Decision-Making for Automated Driving Based on Monte Carlo Tree Search

Karim Essalmi, Fernando Garrido, Fawzi Nashashibi

TL;DR

The paper addresses long-horizon decision-making for automated driving, where fixed planning horizons limit safety and efficiency, and proposes COR-MCTS to extend the planning horizon. It integrates Monte Carlo Tree Search with the COR-MP utility-based maneuver planner, using nodes that store $v$, $m$, $U$, and $UCB$, and an action set that combines lateral and longitudinal maneuvers. Through Selection, Expansion, Simulation, and Backpropagation steps, the method evaluates maneuver sequences and selects the action that maximizes accumulated value within a fixed iteration budget, with $\gamma=0.9$ and $c=\sqrt{2}$. Simulations in two scenarios show that COR-MCTS avoids the unsafe decisions that fixed-horizon planners make in those scenarios and achieves real-time performance, especially when pruning is used (median runtimes: $51.9$ ms for COR-MP, $113.88$ ms for pruned COR-MCTS, and $195.56$ ms for unpruned COR-MCTS). The work advances human-like, long-horizon tactical decision-making for autonomous driving and highlights trade-offs between horizon length, computation, and uncertainty, with plans for real-vehicle validation and interaction-aware extensions.
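The four steps named above can be sketched as a generic MCTS loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy `(lane, speed)` state, the `step` transition, the `utility` function (a stand-in for a COR-MP-style score), the concrete maneuver names, and the choice to discount the return by $\gamma$ once per tree level are all hypothetical. The `Node` fields are named for readability and are not meant to map one-to-one onto the paper's $v$, $m$, $U$, and $UCB$. Only $\gamma=0.9$ and $c=\sqrt{2}$ come from the summary.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Optional

GAMMA = 0.9          # value reported in the summary
C = math.sqrt(2)     # exploration constant reported in the summary

# Hypothetical action set: every (lateral, longitudinal) combination.
ACTIONS = [(lat, lon)
           for lat in ("keep", "left", "right")
           for lon in ("accelerate", "maintain", "decelerate")]


@dataclass
class Node:
    state: tuple                       # toy state: (lane index, speed level)
    parent: Optional["Node"] = None
    action: Optional[tuple] = None     # maneuver that led to this node
    children: list = field(default_factory=list)
    untried: list = field(default_factory=lambda: list(ACTIONS))
    visits: int = 0
    value: float = 0.0                 # running mean of backed-up returns


def step(state, action):
    """Toy deterministic transition (assumption, not a vehicle model)."""
    lane, speed = state
    lat, lon = action
    lane += {"keep": 0, "left": 1, "right": -1}[lat]
    speed += {"accelerate": 1, "maintain": 0, "decelerate": -1}[lon]
    return (lane, speed)


def utility(state):
    """Placeholder utility in [0, 1]; zero when off-road or reversing."""
    lane, speed = state
    if lane < 0 or lane > 2 or speed < 0:
        return 0.0
    return min(speed, 5) / 5.0


def ucb(child, parent_visits):
    """Standard UCB1 score; unvisited children are tried first."""
    if child.visits == 0:
        return math.inf
    return child.value + C * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(state, depth, rng):
    """Simulation: random maneuvers, summing discounted utilities."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        state = step(state, rng.choice(ACTIONS))
        total += discount * utility(state)
        discount *= GAMMA
    return total


def search(root_state, iterations=500, depth=5, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded nodes by UCB.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb(ch, node.visits))
        # Expansion: add one child for a maneuver not yet tried here.
        if node.untried:
            action = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(step(node.state, action), parent=node, action=action)
            node.children.append(child)
            node = child
        # Simulation: estimate the return from the new node.
        ret = utility(node.state) + GAMMA * rollout(node.state, depth, rng)
        # Backpropagation: update means up to the root, discounting per level.
        while node is not None:
            node.visits += 1
            node.value += (ret - node.value) / node.visits
            ret *= GAMMA
            node = node.parent
    # Pick the root maneuver with the highest mean backed-up value.
    return max(root.children, key=lambda ch: ch.value).action


if __name__ == "__main__":
    print(search((1, 2)))
```

The iteration budget is the `iterations` argument. A pruning step, which the summary credits for the lower runtime, would shrink `untried` before expansion and is not shown here.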

Abstract

This paper introduces COR-MCTS (Conservation of Resources - Monte Carlo Tree Search), a novel tactical decision-making approach for automated driving focusing on maneuver planning over extended horizons. Traditional decision-making algorithms are often constrained by fixed planning horizons, typically up to 6 seconds for classical approaches and 3 seconds for learning-based methods, which limits their adaptability in certain dynamic driving scenarios. However, planning must be done well in advance in environments such as highways, roundabouts, and exits to ensure safe and efficient maneuvers. To address this challenge, we propose a hybrid method integrating Monte Carlo Tree Search (MCTS) with our prior utility-based framework, COR-MP (Conservation of Resources Model for Maneuver Planning). This combination enables long-term, real-time decision-making, significantly enhancing the ability to plan a sequence of maneuvers over extended horizons. Through simulations across diverse driving scenarios, we demonstrate that COR-MCTS effectively improves planning robustness and decision efficiency over extended horizons.

Paper Structure

This paper contains 10 sections, 4 equations, 5 figures.

Figures (5)

  • Figure 1: Examples of scenarios where a fixed planning horizon method might provide inappropriate outcomes. $a_i$ represents the different possible actions: $a_0$: Keep Lane action, $a_1$: Change Lane Right action, $a_2$: Change Lane Left action; $t_0$ is the current time; $\Delta t$ is the planning horizon. (a) End of lane: the ego vehicle (EV) should not move to the left lane, even though the vehicle ahead (OV) is driving slowly; otherwise, the EV risks getting stuck in the left lane. (b) Exit ramp: the EV should not overtake the vehicle ahead; otherwise, it will miss the exit lane. (c) Intersection: if the EV needs to turn left, it should not return to the right lane; otherwise, it will be stuck and fail its mission.
  • Figure 2: Example of a tree structure illustrating how a state is defined. $r_0$: root node, representing the current state of the system; $a_i$: action $i$; $s_i$: state $i$.
  • Figure 3: Steps of our method, where orange edges represent a Change Lane Left action and green ones represent a Keep Lane action. For each predicted state $s_k$, our method assigns a value $v$ reflecting the benefit of being in $s_k$. $v$ is then backpropagated up to the root node, so that short-term decisions are influenced by long-term ones. These steps are repeated X times until a suitable solution is found.
  • Figure 4: Various results obtained by testing our approach. Subfigures (a) and (b) illustrate the two scenarios presented in this paper. Subfigures (c) and (d) highlight the behavior adopted by the ego vehicle when the two scenarios are run with two different algorithms. Subfigure (e) shows the distribution of the runtimes of our previous approach (COR-MP) and of the method presented in this study: COR-MCTS with (w/) pruning and COR-MCTS without (w/o) pruning. These runtime values were obtained across various simulated scenarios.
  • Figure 5: Diverse screenshots captured while running the two scenarios presented in this study using our simulator. In each screenshot, the current ego vehicle speed is displayed on the gauge in the top-left corner, while the color label represents the planning time for the displayed ego trajectories. The top-right viewer in each subfigure shows the feasible maneuvers, their assigned values $v$, and the maneuver selected by the maneuver planner, which is highlighted with a black square. Subfigures (a) and (c) depict moments where the classical method produces inappropriate outcomes for Scenario 1 and Scenario 2, respectively. In contrast, subfigures (b) and (d) illustrate these situations using an extended planning horizon method, demonstrating its improved decision-making.
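
The backpropagation described in the Figure 3 caption can be illustrated with a small worked example. This is only a sketch under two assumptions that the text above does not confirm: that $\gamma = 0.9$ is applied as a per-level discount while $v$ travels toward the root, and that the selection score has the standard UCB1 form. The symbols $\bar{v}_i$ (mean value of child $i$), $n_i$ (visits to child $i$), $N$ (visits to its parent), and $d$ (depth of $s_k$ below the root) are introduced here for illustration.

$$
UCB_i = \bar{v}_i + c \sqrt{\frac{\ln N}{n_i}}, \qquad c = \sqrt{2}
$$

$$
\text{contribution of } s_k \text{ at the root} = \gamma^{d} \, v(s_k), \qquad d = 3,\; v(s_k) = 1.0 \;\Rightarrow\; 0.9^{3} \times 1.0 = 0.729
$$

Under these assumptions, a state three maneuvers ahead still carries about 73% of its value back to the root, which is how a distant event such as a lane ending can influence the maneuver chosen now.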