Learning-Based Tracking Perimeter Control for Two-region Macroscopic Traffic Dynamics

Can Chen, Yunping Huang, Hongwei Zhang, Shimin Wang, Martin Guay, Shu-Chien Hsu, Renxin Zhong

TL;DR

The paper tackles congestion in two-region networks modeled by the Macroscopic Fundamental Diagram (MFD) by introducing a tracking perimeter control framework that follows a time-varying reference trajectory $n^d(t)$ rather than a fixed set-point. It formulates an Optimal Tracking Perimeter Control Problem (OTPCP) and develops a model-free Adaptive Dynamic Programming (ADP) solution via offline policy iteration and integral reinforcement learning, enabling robust performance under uncertain demand. By augmenting the system with a command generator and employing a nonquadratic cost to handle input saturation, the approach computes an optimal tracking policy $\boxed{\mu^*(N)=-\lambda\tanh\left(\frac{1}{2\lambda}R^{-1}S(N)^T\nabla V^*(N)\right)}$ without requiring full knowledge of the dynamics. Experiments show substantial performance gains over conventional SPC, including a ~20% reduction in total travel time and a ~3% boost in trip completion under nominal demand, and demonstrated robustness to disturbances with time-varying trajectories. The results validate the practicality of learning-based tracking perimeter control for real-time, large-scale traffic management with uncertain demand patterns.
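To make the saturated policy formula concrete, the sketch below evaluates $\mu = -\lambda\tanh\left(\frac{1}{2\lambda}R^{-1}S^T\nabla V\right)$ for a single state. It is only an illustration under stated assumptions: the function name `saturated_policy`, the dimensions (three augmented states, two control inputs), and all numeric values are hypothetical and are not taken from the paper. In the paper's setting the gradient $\nabla V^*(N)$ would come from the learned value function, whereas here it is simply a placeholder vector.

```python
import numpy as np

def saturated_policy(grad_V, S, R, lam):
    """Evaluate mu = -lam * tanh((1 / (2 * lam)) * R^{-1} S^T grad_V).

    grad_V : value-function gradient at the current state, shape (n,)
    S      : input matrix evaluated at the current state, shape (n, m)
    R      : positive-definite input weight matrix, shape (m, m)
    lam    : saturation bound, so every component satisfies |mu_i| <= lam
    """
    # Solve R z = S^T grad_V rather than forming R^{-1} explicitly.
    z = np.linalg.solve(R, S.T @ grad_V)
    return -lam * np.tanh(z / (2.0 * lam))

# Hypothetical numbers for illustration only (not from the paper).
S = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, -0.5]])
R = np.diag([1.0, 2.0])
grad_V = np.array([40.0, -0.2, 4.0])
lam = 0.4

mu = saturated_policy(grad_V, S, R, lam)
print(mu)  # each component lies within [-lam, lam]
```

Because $\tanh$ is bounded by 1 in magnitude, the returned control stays inside $[-\lambda, \lambda]$ however large the gradient becomes, which is how this form of policy respects the input saturation without any explicit clipping.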

Abstract

Leveraging the concept of the macroscopic fundamental diagram (MFD), perimeter control can alleviate network-level congestion by identifying critical intersections and regulating them effectively. Considering the time-varying nature of travel demand and the equilibrium of accumulation state, we extend the conventional set-point perimeter control (SPC) problem for the two-region MFD system as an optimal tracking perimeter control problem (OTPCP). Unlike the SPC schemes that stabilize the traffic dynamics to the desired equilibrium point, the proposed tracking perimeter control (TPC) scheme regulates the traffic dynamics to a desired trajectory in a differential framework. Due to the inherent network uncertainties, such as heterogeneity of traffic dynamics and demand disturbance, the system dynamics could be uncertain or even unknown. To address these issues, we propose an adaptive dynamic programming (ADP) approach to solving the OTPCP without utilizing the well-calibrated system dynamics. Numerical experiments demonstrate the effectiveness of the proposed ADP-based TPC. Compared with the SPC scheme, the proposed TPC scheme achieves a 20.01% reduction in total travel time and a 3.15% improvement in cumulative trip completion. Moreover, the proposed adaptive TPC approach can regulate the accumulation state under network uncertainties and demand disturbances to the desired time-varying equilibrium trajectory that aims to maximize the trip completion under a nominal demand pattern. These results validate the robustness of the adaptive TPC approach.

Paper Structure

This paper contains 9 sections, 1 theorem, 31 equations, 11 figures, 2 tables, 1 algorithm.

Key Result

Proposition 3.1

The model-free ADP algorithm (Algorithm 1) and the model-based policy iteration method (Eqs. (8)-(9)) give an equivalent solution to the OTPCP.

Figures (11)

  • Figure 1: The two-region MFD system
  • Figure 2: Demand pattern and desired state trajectory
  • Figure 3: The diverse time-varying demand patterns
  • Figure 4: MFDs under demand variations
  • Figure 5: Accumulation state evolutions of Example 1
  • ...and 6 more figures

Theorems & Definitions (2)

  • Proposition 3.1
  • proof