Communication Strategy on Macro-and-Micro Traffic State in Cooperative Deep Reinforcement Learning for Regional Traffic Signal Control

Hankang Gu, Shangbo Wang, Dongyao Jia, Yuli Zhang, Yanrong Luo, Guoqiang Mao, Jianping Wang, Eng Gee Lim

TL;DR

This work addresses scalable, cooperative regional traffic signal control (RTSC) under MADRL, where non-stationarity and coordination challenges hinder performance. It provides a formal Markovian justification of RTSC dynamics using a store-and-forward queueing model and introduces two GAT-based communication modules, GA2-Naive and GA2-Aug, to capture micro lane-level and macro intersection-level correlations. The GA2 modules are integrated with two RTSC frameworks, RegionLight and Regional-DRL, and evaluated on real and synthetic grid networks, yielding consistent improvements in average travel time and robust performance across hyperparameters. The approach offers a principled, scalable pathway to enhance MADRL-based RTSC through centralized macro/micro state sharing, with implications for deploying cooperative traffic control in large urban networks.

Abstract

Adaptive Traffic Signal Control (ATSC) has become a popular research topic in intelligent transportation systems. Regional Traffic Signal Control (RTSC) using Multi-agent Deep Reinforcement Learning (MADRL) has become a promising approach to ATSC due to its ability to achieve the optimum trade-off between scalability and optimality. Most existing RTSC approaches partition a traffic network into several disjoint regions and then apply centralized reinforcement learning to each region. However, cooperation among RTSC agents remains an open issue, and no communication strategy for RTSC agents has been investigated. In this paper, we propose communication strategies to capture the correlation of micro-traffic states among lanes and the correlation of macro-traffic states among intersections. We first justify that the evolution equation of the RTSC process is Markovian via a system of store-and-forward queues. Next, based on the evolution equation, we propose two GAT-Aggregated (GA2) communication modules, GA2-Naive and GA2-Aug, to extract both intra-region and inter-region correlations between macro and micro traffic states. While GA2-Naive considers only the movements at each intersection, GA2-Aug also considers the lane-changing behavior of vehicles. The two proposed communication modules are then integrated into two existing RTSC frameworks, RegionLight and Regional-DRL. Experimental results demonstrate that both GA2-Naive and GA2-Aug improve the performance of the existing RTSC frameworks under both real and synthetic scenarios. Hyperparameter testing also reveals the robustness and potential of our communication modules in large-scale traffic networks.

Paper Structure

This paper contains 29 sections, 33 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Isolated Intersection and Phase Configuration
  • Figure 2: An arbitrary traffic network with external intersections (circles with dashed lines) and internal intersections (circles with solid lines). Intersections of the current region $\mathcal{W}$ are filled with yellow and intersections outside the current region are filled with gray. Intersections that are not adjacent to $\mathcal{W}$ are omitted for simplicity. Red arrows in (b), (c), and (d) describe specific interactions in three scenarios.
  • Figure 3: The architecture of the proposed information-sharing module. Macro and micro state inputs of the whole traffic network are first embedded centrally by stacked GATs; the two GAT stacks do not share weights. Then, the macro and micro hidden features are regrouped according to the region configuration and concatenated to form the observation of each regional agent. Next, each decentralized agent predicts the best action based on its observation. Finally, the union of the best actions of all agents is the optimal global control strategy for the whole traffic network.
  • Figure 4: Overall Framework. Our framework contains three components: simulator, agent, and memory. The simulator simulates the traffic environment and provides traffic states to the agents. The agents then make decisions based on the traffic states in three stages. First, the centralized communication stage processes the network-level traffic state so that RTSC agents can share information. Then, in the feature regrouping stage, hidden features are regrouped and flattened for each region. Finally, in the decentralized control stage, RTSC agents choose the actions for their regions in a decentralized manner. The memory component stores recent transition tuples for future training.
  • Figure 5: Example of Lane Segmentation. Suppose we have an incoming road with three lanes and each lane is segmented into three cells. Then the lane level state of these incoming lanes is $\{[1,3,2],[2,2,0],[2,0,2]\}$.
  • ...and 4 more figures
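
The lane-level state in the Figure 5 caption can be illustrated with a minimal Python sketch. It assumes that each lane is split into equal-length cells and that the state is the vehicle count per cell; the function name `lane_cell_counts`, the 300 m lane length, the vehicle positions, and the cell ordering (cell 0 starts at position 0) are all hypothetical choices made only to reproduce the example state, not details taken from the paper.

```python
from typing import List, Sequence


def lane_cell_counts(positions: Sequence[float],
                     lane_length: float,
                     num_cells: int) -> List[int]:
    """Count vehicles in each equal-length cell of a single lane.

    Assumption: positions are distances in metres measured from one end
    of the lane, and cell 0 starts at position 0.
    """
    if lane_length <= 0 or num_cells <= 0:
        raise ValueError("lane_length and num_cells must be positive")
    cell_length = lane_length / num_cells
    counts = [0] * num_cells
    for pos in positions:
        if not 0 <= pos <= lane_length:
            raise ValueError(f"position {pos} is outside the lane")
        # A vehicle exactly at the far end is assigned to the last cell.
        idx = min(int(pos // cell_length), num_cells - 1)
        counts[idx] += 1
    return counts


# Hypothetical vehicle positions on an incoming road with three lanes.
lanes = [
    [50.0, 120.0, 150.0, 180.0, 210.0, 290.0],
    [10.0, 90.0, 110.0, 190.0],
    [20.0, 80.0, 220.0, 280.0],
]
road_state = [lane_cell_counts(p, lane_length=300.0, num_cells=3) for p in lanes]
print(road_state)  # [[1, 3, 2], [2, 2, 0], [2, 0, 2]]
```

With these made-up positions, the result matches the state $\{[1,3,2],[2,2,0],[2,0,2]\}$ given in the caption. A finer segmentation would give a higher-resolution micro state at the cost of a larger input.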

Theorems & Definitions (4)

  • Definition 1: Movement Matrix
  • Definition 2: Routing Proportion Matrix
  • Definition 3: Blockage Matrix
  • Remark 1