Applying Neural Monte Carlo Tree Search to Unsignalized Multi-intersection Scheduling for Autonomous Vehicles

Yucheng Shi, Wenlong Wang, Xiaowen Tao, Ivana Dusparic, Vinny Cahill

TL;DR

This paper introduces a transformation model that maps successive sequences of potentially conflicting road-space reservation requests from platoons of vehicles into a series of board-game-like problems and uses NMCTS to search for solutions representing optimal road-space allocation schedules in the context of past allocations.

Abstract

Dynamic scheduling of access to shared resources by autonomous systems is a challenging problem, characterized as being NP-hard. The complexity of this task leads to a combinatorial explosion of possibilities in highly dynamic systems where arriving requests must be continuously scheduled subject to strong safety and time constraints. An example of such a system is an unsignalized intersection, where automated vehicles' access to potential conflict zones must be dynamically scheduled. In this paper, we apply Neural Monte Carlo Tree Search (NMCTS) to the challenging task of scheduling platoons of vehicles crossing unsignalized intersections. Crucially, we introduce a transformation model that maps successive sequences of potentially conflicting road-space reservation requests from platoons of vehicles into a series of board-game-like problems and use NMCTS to search for solutions representing optimal road-space allocation schedules in the context of past allocations. To optimize search, we incorporate a prioritized re-sampling method with parallel NMCTS (PNMCTS) to improve the quality of training data. To optimize training, a curriculum learning strategy is used to train the agent to schedule progressively more complex boards, culminating in overlapping boards that represent busy intersections. In a simulation of a busy single four-way unsignalized intersection, PNMCTS solved 95% of unseen scenarios, reducing crossing time by 43% in light traffic and 52% in heavy traffic versus first-in, first-out control. In a 3x3 multi-intersection network, the proposed method maintained free flow in light traffic when all intersections were under the control of PNMCTS, and in heavy traffic it outperformed state-of-the-art RL-based traffic-light controllers by 74.5% in average travel time and by 16% in total throughput.

Paper Structure

This paper contains 22 sections, 6 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: An intersection with four approaching platoons is illustrated on the left. The corresponding board is shown on the right. Entry and exit times for collision areas are shown on the timeline (X-axis), while the Y-axis shows platoon numbers. As platoon 1 moves forward, it crosses collision areas C $\rightarrow$ B $\rightarrow$ A in that order, with its occupancy time in each collision area mapped to a time block (blocks are shown stacked for each platoon for readability). The occupancy times of collision areas C, B, and A by platoon 1 overlap because the platoon is longer than the gap between the collision areas. The illustration shows that, without a scheduling intervention, platoons 1 and 4 would collide in collision area A and platoons 3 and 4 would collide in collision area D, because each pair would occupy the area simultaneously.
  • Figure 2: Grid representation of transformation board.
  • Figure 3: A transformation board representing a busy intersection that is executing a previous schedule (the upper board), which cannot be rescheduled. The newly approaching platoons could collide with the previous schedule in collision areas A and B. Thus, the current schedule must be compatible with the already-solved board.
  • Figure 4: An optimal crossing schedule produced by a PNMCTS agent at a busy intersection. From top to bottom, the platoon delay sequence is: initial board $\rightarrow$ platoon 1 $\rightarrow$ platoon 0 $\rightarrow$ platoon 2 $\rightarrow$ platoon 0.
  • Figure 5: (a) Performance curve with curriculum training; (b) average board-solving time and solution-quality distribution.
  • ...and 3 more figures
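
The board described in the Figure 1 caption can be illustrated with a small sketch. The code below is not the authors' implementation: the `Block` type, the `conflicts` and `delay` helpers, and all entry and exit times are hypothetical, chosen only so that the example reproduces the two collisions named in the caption (platoons 1 and 4 in area A, platoons 3 and 4 in area D). It assumes that a collision corresponds to two different platoons holding overlapping time blocks in the same collision area, and that a scheduling action shifts all of one platoon's blocks later in time, as the delay sequence in Figure 4 suggests.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Block:
    """One time block on the board (hypothetical representation)."""
    platoon: int   # platoon number (Y-axis of the board)
    area: str      # collision area label, e.g. "A"
    t_in: float    # time the platoon enters the area
    t_out: float   # time the platoon leaves the area


def conflicts(blocks):
    """Return pairs of blocks from different platoons that overlap in the same area."""
    found = []
    for a, b in combinations(blocks, 2):
        same_area = a.area == b.area
        overlap = a.t_in < b.t_out and b.t_in < a.t_out
        if a.platoon != b.platoon and same_area and overlap:
            found.append((a, b))
    return found


def delay(blocks, platoon, dt):
    """Shift every block of one platoon later by dt, leaving the others untouched."""
    return [
        Block(b.platoon, b.area, b.t_in + dt, b.t_out + dt) if b.platoon == platoon else b
        for b in blocks
    ]


# Illustrative numbers only; they are not taken from the paper.
board = [
    Block(1, "C", 0.0, 4.0), Block(1, "B", 2.0, 6.0), Block(1, "A", 4.0, 8.0),
    Block(2, "B", 7.0, 10.0),
    Block(3, "D", 4.0, 7.0),
    Block(4, "D", 3.0, 6.0), Block(4, "A", 5.0, 8.0),
]

for a, b in conflicts(board):
    print(f"platoons {a.platoon} and {b.platoon} conflict in area {a.area}")

# Delaying platoon 4 by 4 time units clears both conflicts in this toy board.
resolved = delay(board, 4, 4.0)
print("conflicts after delay:", len(conflicts(resolved)))
```

In this toy board a single delay happens to resolve everything. The paper's contribution is the search over such delay actions with NMCTS, which this sketch does not attempt.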