CoordLight: Learning Decentralized Coordination for Network-Wide Traffic Signal Control

Yifeng Zhang, Harsh Goel, Peizhuo Li, Mehul Damani, Sandeep Chinchali, Guillaume Sartoretti

Abstract

Adaptive traffic signal control (ATSC) is crucial in alleviating congestion, maximizing throughput, and promoting sustainable mobility in ever-expanding cities. Multi-Agent Reinforcement Learning (MARL) has recently shown significant potential in addressing complex traffic dynamics, but the intricacies of partial observability and coordination in decentralized environments remain key challenges in formulating scalable and efficient control strategies. To address these challenges, we present CoordLight, a MARL-based framework designed to improve intra-neighborhood traffic by enhancing decision-making at individual junctions (agents), as well as coordination with neighboring agents, thereby scaling up to network-level traffic optimization. Specifically, we introduce the Queue Dynamic State Encoding (QDSE), a novel state representation based on vehicle queuing models, which strengthens the agents' capability to analyze, predict, and respond to local traffic dynamics. We further propose an advanced MARL algorithm, named Neighbor-aware Policy Optimization (NAPO). It integrates an attention mechanism that discerns the state and action dependencies among adjacent agents, aiming to facilitate more coordinated decision-making and to improve policy learning updates through robust advantage calculation. This enables agents to identify and prioritize crucial interactions with influential neighbors, thus enhancing targeted coordination and collaboration among agents. Through comprehensive evaluations against state-of-the-art traffic signal control methods on three real-world traffic datasets composed of up to 196 intersections, we empirically show that CoordLight consistently exhibits superior performance across diverse traffic networks with varying traffic flows. The code is available at https://github.com/marmotlab/CoordLight
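
To make the idea of attending to influential neighbors concrete, the following is a minimal sketch of generic scaled dot-product attention, in which one agent weighs the feature vectors of its adjacent agents. It is not the NAPO architecture itself: the function names `attention_weights` and `aggregate`, the two-dimensional feature vectors, and the use of raw features as queries, keys, and values are all assumptions made for illustration.

```python
import math


def attention_weights(query, keys):
    """Softmax over scaled dot products between one agent and its neighbors.

    `query` is the (hypothetical) feature vector of the deciding agent and
    `keys` holds one feature vector per neighboring agent.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Subtract the maximum score for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(weights, values):
    """Weighted sum of neighbor vectors: the neighborhood summary."""
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Illustrative (made-up) features: the first neighbor is most similar to the agent.
agent = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

weights = attention_weights(agent, neighbors)
context = aggregate(weights, neighbors)
```

In this toy setting the neighbor whose features align most closely with the agent's receives the largest weight, which is the sense in which attention lets an agent prioritize some neighbors over others.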

Paper Structure

This paper contains 29 sections, 16 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: Overall learning framework of CoordLight, which introduces a novel state definition, Queue Dynamic State Encoding (QDSE), for fine-grained traffic representation, and a Neighbor-aware Policy Optimization (NAPO) algorithm to learn efficient coordination strategies among neighboring agents.
  • Figure 2: An illustration of a single intersection with eight traffic signal phases, where the phase $\textit{NS-Left}$ is currently activated, allowing the vehicles along the specified traffic movements (green dotted lines) to traverse the intersection.
  • Figure 3: An illustration of the proposed state representation QDSE for an incoming lane $l$ of a single intersection, where $Q^l(t)=3$, $N^l_{in}(t)=1$, $N^l_{out}(t)=0$, $N^l_{fr}(t)=3$, $N^l_{r}(t)=6$, and $D^l_{fr}(t)=15$ in this case.
  • Figure 4: Detailed structure of our neighbor-aware actor-critic network in CoordLight: Fig. (a) shows the overall structure of our attention-based, spatio-temporal actor network (STN). This network consists of a spatial aggregation unit and a temporal aggregation unit to process information from the agent's neighborhood for policy output. Fig. (b) illustrates the structure of the privileged local critic network, which includes a state encoder and a state-action decoder that captures crucial interactions and incorporates neighbors' state-action dependencies into the value estimation.
  • Figure 5: Real-world traffic road networks of the CityFlow simulation environments used in our experiments: Jinan, China map (left, $3 \times 4$ intersections), Hangzhou, China map (center, $4 \times 4$ intersections), and New York, USA map (right, $7 \times 28$ intersections).
  • ...and 3 more figures
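
As an illustration of the per-lane state in the Figure 3 caption, the sketch below stores the six quantities $Q^l(t)$, $N^l_{in}(t)$, $N^l_{out}(t)$, $N^l_{fr}(t)$, $N^l_{r}(t)$, and $D^l_{fr}(t)$ in a small record and flattens them into a vector. The class name `LaneQDSE`, the field order, and the concatenation over lanes in `intersection_state` are assumptions for illustration; the meaning of each symbol is defined in the paper body and is not restated here.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class LaneQDSE:
    """Hypothetical container for the six per-lane QDSE quantities.

    Field names mirror the symbols in the Figure 3 caption; their exact
    semantics are given in the paper, not in this sketch.
    """
    q: float      # Q^l(t)
    n_in: float   # N^l_in(t)
    n_out: float  # N^l_out(t)
    n_fr: float   # N^l_fr(t)
    n_r: float    # N^l_r(t)
    d_fr: float   # D^l_fr(t)

    def as_vector(self):
        """Return the six values as a flat list of floats, in field order."""
        return [float(x) for x in astuple(self)]


def intersection_state(lanes):
    """Concatenate per-lane vectors (an assumed layout, not the paper's)."""
    return [x for lane in lanes for x in lane.as_vector()]


# Values taken from the Figure 3 caption.
lane = LaneQDSE(q=3, n_in=1, n_out=0, n_fr=3, n_r=6, d_fr=15)
lane_vector = lane.as_vector()
two_lane_state = intersection_state([lane, lane])
```

Under this layout, each incoming lane contributes six entries, so an intersection with several incoming lanes yields a state whose length is six times the lane count.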