Generalized Phase Pressure Control Enhanced Reinforcement Learning for Traffic Signal Control

Xiao-Cheng Liao, Yi Mei, Mengjie Zhang, Xiang-Ling Chen

TL;DR

This work addresses two problems in traffic signal control: designing theoretically grounded traffic state representations, and learning stable, high-performance policies. It introduces Generalized Phase Pressure (G2P), a pressure-based control framework that accounts for both absolute and relative traffic conditions at multi-lane intersections, and extends pressure theory to multi-homogeneous-lane networks. The authors derive a generalized phase pressure, propose an RL template (G2P-XLight) with two variants (G2P-MPLight and G2P-CoLight), and report substantial gains over state-of-the-art heuristic and learning-based methods on real-world CityFlow datasets. The results indicate improved performance, stability, and data efficiency, with G2P-CoLight generalizing well to unseen Manhattan scenarios. The code is available for reproducibility.

Abstract

Appropriate traffic state representation is crucial for learning traffic signal control policies. However, most current traffic state representations are heuristically designed, with insufficient theoretical support. In this paper, we (1) develop a flexible, efficient, and theoretically grounded method, namely generalized phase pressure (G2P) control, which considers only simple lane features to decide which phase to actuate; (2) extend pressure control theory to a general form for multi-homogeneous-lane road networks based on queueing theory; (3) design a new traffic state representation based on the generalized phase state features from G2P control; and (4) develop a reinforcement learning (RL)-based algorithm template named G2P-XLight, and two RL algorithms, G2P-MPLight and G2P-CoLight, by combining the generalized phase state representation with MPLight and CoLight, two well-performing RL methods for learning traffic signal control policies. Extensive experiments on multiple real-world datasets demonstrate that G2P control outperforms the state-of-the-art (SOTA) heuristic method in the transportation field and other recent human-designed heuristic methods, and that the newly proposed G2P-XLight significantly outperforms SOTA learning-based approaches. Our code is available online.
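For orientation, the kind of pressure-based phase selection that the abstract says G2P generalizes can be sketched as follows. This is a minimal illustration of the classic, simplified max-pressure rule (movement pressure as incoming queue minus outgoing queue), not the paper's generalized phase pressure, whose definition is not given in this summary. The lane identifiers, the `queue` dictionary, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# A movement is a pair (incoming lane id, outgoing lane id); ids are hypothetical.
Movement = Tuple[str, str]


def movement_pressure(queue: Dict[str, int], movement: Movement) -> int:
    """Simplified classic pressure: vehicles queued upstream minus downstream."""
    incoming, outgoing = movement
    return queue.get(incoming, 0) - queue.get(outgoing, 0)


def phase_pressure(queue: Dict[str, int], movements: List[Movement]) -> int:
    """Pressure of a phase: sum of the pressures of the movements it serves."""
    return sum(movement_pressure(queue, m) for m in movements)


def select_phase(queue: Dict[str, int], phases: Dict[str, List[Movement]]) -> str:
    """Actuate the phase with the largest pressure (ties go to the first phase)."""
    return max(phases, key=lambda name: phase_pressure(queue, phases[name]))


# Toy intersection with two phases, one movement each (assumed data).
queue = {"n_in": 5, "s_out": 1, "e_in": 2, "w_out": 0}
phases = {"NS": [("n_in", "s_out")], "EW": [("e_in", "w_out")]}
chosen = select_phase(queue, phases)
```

In this toy example the north-south phase has pressure 4 and the east-west phase has pressure 2, so the rule selects the north-south phase.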

Paper Structure

This paper contains 24 sections, 59 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: A multi-homogeneous-lane intersection, where a single road has multiple lanes with the same turning direction, is shown with illustrations of some symbols and concepts as well as an example of calculating the generalized turn movement pressure $\mathbb{P}\{\mathcal{T}^{\rightarrow}\}$.
  • Figure 2: Each algorithm was run with 30 seeds; the mean curve is shown with its 95% confidence interval as a shaded region. For clarity, only the mean curves are shown after 125 episodes.
  • Figure 3: The average travel time (lower is better) of different algorithms on unseen scenarios.
  • Figure 4: The average queue length (lower is better) of different algorithms.
  • Figure 5: Convergence curves of different algorithms in different scenarios. Each algorithm was run with 30 seeds; the mean curve is shown with its 95% confidence interval as a shaded region. MPLight-based methods consistently converge faster than CoLight-based methods. As the network scale increases (from Jinan to Hangzhou), the advantage of G2P-CoLight becomes increasingly evident.
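The mean curves and 95% confidence bands described in the captions for Figures 2 and 5 can be computed along the following lines. This is a sketch under stated assumptions: the summary does not say how the authors computed their intervals, so the normal approximation (z = 1.96) and the function name `mean_ci_curve` are illustrative choices, and the input data are made up.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def mean_ci_curve(
    runs: Sequence[Sequence[float]], z: float = 1.96
) -> List[Tuple[float, float, float]]:
    """Return (mean, lower, upper) per episode across seeds.

    runs holds one curve per seed; all curves must have the same length.
    The interval uses a normal approximation, which is an assumption here.
    """
    n = len(runs)
    if n < 2:
        raise ValueError("need at least two seeds to estimate a spread")
    curve = []
    for values in zip(*runs):  # iterate episode by episode
        mean = statistics.fmean(values)
        half_width = z * statistics.stdev(values) / math.sqrt(n)
        curve.append((mean, mean - half_width, mean + half_width))
    return curve


# Two seeds, two episodes (toy data, not results from the paper).
band = mean_ci_curve([[1.0, 2.0], [3.0, 2.0]])
```

With 30 seeds, a Student's t critical value (about 2.045 for 29 degrees of freedom) would give slightly wider bands than the normal approximation used here.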