CycLight: learning traffic signal cooperation with a cycle-level strategy

Gengyue Han, Xiaohan Liu, Xianyue Peng, Hao Wang, Yu Han

TL;DR

CycLight tackles network-level adaptive traffic signal control (NATSC) with a cycle-level multi-agent reinforcement learning (MARL) framework that jointly optimizes cycle length and phase splits using Parameterized Deep Q-Networks (PDQN), which handle discrete-continuous hybrid actions. The problem is formulated as a parameterized action Markov decision process (PAMDP) and solved with a decentralized, attention-augmented architecture that coordinates across intersections, using time-series cycle-state representations and a reward that balances waiting time against secondary queues. In SUMO experiments on a 5×5 grid, CycLight outperforms state-of-the-art baselines in waiting-time reduction and throughput, and an advance-control variant demonstrates robustness to information transmission delays. The work offers a scalable, practical cycle-level control solution for NATSC and highlights the value of attention in multi-agent coordination for urban traffic networks.

Abstract

This study introduces CycLight, a novel cycle-level deep reinforcement learning (RL) approach for network-level adaptive traffic signal control (NATSC) systems. Unlike most traditional RL-based traffic controllers, which make decisions step by step, CycLight adopts a cycle-level strategy, optimizing cycle length and splits simultaneously using the Parameterized Deep Q-Networks (PDQN) algorithm. This cycle-level approach reduces the computational burden associated with frequent data communication while enhancing the practicality and safety of real-world applications. A decentralized framework is formulated for multi-agent cooperation, and an attention mechanism is integrated to accurately assess the impact of the surroundings on the current intersection. CycLight is tested in a large synthetic traffic grid using the microscopic traffic simulation tool SUMO. Experimental results not only demonstrate the superiority of CycLight over other state-of-the-art approaches but also showcase its robustness against information transmission delays.
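To make the discrete-continuous hybrid action concrete, here is a minimal sketch of PDQN-style action selection at the cycle level: for each candidate cycle length (the discrete action), a parameter network proposes splits (the continuous parameters), a Q-function scores the pair, and the best-scoring pair is chosen. The candidate cycle lengths, the four-phase state, the softmax used to keep the splits on the simplex, the lost-time handling, and the two toy networks are all illustrative assumptions, not details taken from the paper.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical candidate cycle lengths in seconds (not from the paper).
CYCLE_LENGTHS = [60, 90, 120]


def softmax(xs: Sequence[float]) -> List[float]:
    """Map arbitrary logits to non-negative values that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def select_hybrid_action(
    state: Sequence[float],
    cycle_lengths: Sequence[int],
    param_net: Callable[[Sequence[float], int], List[float]],
    q_net: Callable[[Sequence[float], int, Sequence[float]], float],
) -> Tuple[int, List[float]]:
    """Greedy PDQN-style selection over (cycle length, splits) pairs."""
    best: Optional[Tuple[float, int, List[float]]] = None
    for k in range(len(cycle_lengths)):
        # Continuous parameters proposed for discrete action k.
        # Assumption: softmax is used here so the splits sum to 1.
        splits = softmax(param_net(state, k))
        q = q_net(state, k, splits)
        if best is None or q > best[0]:
            best = (q, k, splits)
    assert best is not None, "cycle_lengths must not be empty"
    _, k_best, splits_best = best
    return cycle_lengths[k_best], splits_best


def green_times(cycle_length: float, splits: Sequence[float],
                lost_time: float = 0.0) -> List[float]:
    """Convert splits into per-phase green durations (seconds)."""
    usable = cycle_length - lost_time
    return [usable * s for s in splits]


# Toy stand-ins for the learned networks (purely illustrative).
def toy_param_net(state: Sequence[float], k: int) -> List[float]:
    # Use per-phase demand directly as logits, ignoring k.
    return list(state)


def toy_q_net(state: Sequence[float], k: int,
              splits: Sequence[float]) -> float:
    # Prefer the cycle length closest to a demand-dependent target.
    target = 40.0 + 20.0 * sum(state)
    return -abs(CYCLE_LENGTHS[k] - target)
```

In a trained agent, `toy_param_net` and `toy_q_net` would be replaced by neural networks, and exploration (for example epsilon-greedy over the discrete action) would be layered on top of this greedy choice.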

Paper Structure

This paper contains 23 sections, 18 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Illustration of cycle-level TSC learning with discrete-continuous hybrid actions. The cycle length is decided by a discrete action, while the splits are represented as continuous parameters.
  • Figure 2: The process of gathering observed states. (a) is an example of RL-based ATSC at a single intersection. The lane area sensors detect the vehicle counts on each approach and exit at the end of Phase III, and store them as $\bar{c}_p^{\rm{a}}\left( \kappa \right)$ and $\bar{c}_p^{\rm{e}}\left( \kappa \right)$, respectively. (b) shows the formulation of time-series observations, which are gathered throughout a complete cycle.
  • Figure 3: PDQN agents in a decentralized setting. The current intersection pays more attention to important neighbors during the information-sharing process.
  • Figure 4: Detailed structure of CycLight.
  • Figure 5: The synthetic traffic grid test-bed.
  • ...and 5 more figures
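
The neighbor weighting described for Figure 3 can be illustrated with a generic scaled dot-product attention step. This is only a sketch under stated assumptions: the exact attention form used by CycLight is not given in this summary, and the function below uses raw embedding vectors as queries, keys, and values with no learned projections. The name `attend_to_neighbors` and the two-dimensional embeddings are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(xs: Sequence[float]) -> List[float]:
    """Map arbitrary scores to non-negative weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend_to_neighbors(
    own: Sequence[float],
    neighbors: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Weight neighbor embeddings by similarity to the current intersection.

    Returns the attention weights (one per neighbor) and the weighted
    sum of neighbor embeddings. Assumption: no learned projections.
    """
    d = len(own)
    # Scaled dot-product score between the current intersection and each neighbor.
    scores = [
        sum(a * b for a, b in zip(own, n)) / math.sqrt(d)
        for n in neighbors
    ]
    weights = softmax(scores)
    # Aggregate neighbor information according to the weights.
    context = [
        sum(w * n[i] for w, n in zip(weights, neighbors))
        for i in range(d)
    ]
    return weights, context
```

Neighbors whose embeddings align with the current intersection's embedding receive larger weights, so their information dominates the aggregated context vector passed on to the agent.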