Adaptive traffic signal control optimization using a novel road partition and multi-channel state representation method

Maojiang Deng, Shoufeng Lu, Jiazhao Shi, Wen Zhang

TL;DR

This work tackles urban intersection congestion with an adaptive traffic signal control framework that combines a novel road-partitioning scheme with a multi-channel traffic state representation. The road is partitioned using the logarithmic–linear formula $f(x) = a \ln(x+1) + b x$, which produces a fixed number of cells that adapt to the sensor range and enable cross-intersection transfer. The state input comprises three channels (vehicle counts, speeds, occupancy), each normalized via $x_{i,\mathrm{norm}} = x_i / x_{\mathrm{maxhist}}$. Two DRL algorithms, DQN and PPO, are trained in SUMO–TraCI simulations. The reward function blends waiting time, speed, and fuel consumption as $r(t) = -0.7 \cdot waiting_{penalty} + 0.2 \cdot speed_{reward} - 0.1 \cdot fuel_{penalty}$, with penalties defined by $waiting_{penalty} = \min(cur_{waiting}/200, 1)$, etc. Results show that the VCL-PPO configuration (variable cell length with PPO) performs best on cumulative queue length and waiting time and transfers across sensor ranges, which suggests good generalization and practical deployment potential with radar–video sensing data.
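
The weighted reward above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the summary only spells out the waiting-time penalty, so the sketch assumes the speed and fuel terms use the same "divide by a typical maximum and cap at 1" form. The names `capped_ratio`, `speed_max`, and `fuel_max` are hypothetical, and the two typical maxima are left as required arguments because their values are not given here.

```python
def capped_ratio(value: float, typical_max: float) -> float:
    """Scale a metric by a typical maximum and cap the result at 1."""
    if typical_max <= 0:
        raise ValueError("typical_max must be positive")
    return min(value / typical_max, 1.0)


def reward(cur_waiting: float, cur_speed: float, cur_fuel: float,
           speed_max: float, fuel_max: float,
           waiting_max: float = 200.0) -> float:
    """r(t) = -0.7 * waiting_penalty + 0.2 * speed_reward - 0.1 * fuel_penalty.

    waiting_max = 200 comes from the summary; speed_max and fuel_max are
    hypothetical parameters, and their normalization form is an assumption.
    """
    waiting_penalty = capped_ratio(cur_waiting, waiting_max)
    speed_reward = capped_ratio(cur_speed, speed_max)   # assumed form
    fuel_penalty = capped_ratio(cur_fuel, fuel_max)     # assumed form
    return -0.7 * waiting_penalty + 0.2 * speed_reward - 0.1 * fuel_penalty
```

Because each term is capped at 1, the reward under these assumptions stays between -0.8 and 0.2, and the weights make waiting time the dominant signal.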

Abstract

This study proposes a novel adaptive traffic signal control method that uses a Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) to optimize signal timing, integrating variable cell length with a multi-channel state representation. A road partition formula consisting of the sum of a logarithmic and a linear function is proposed. The state is a vector composed of three channels: the number of vehicles, the average speed, and space occupancy. The action space is the set of available signal phases; the selected phase is executed with a fixed green time. The reward function is formulated using the absolute values of key traffic state metrics: waiting time, speed, and fuel consumption. Each metric is normalized by a typical maximum value and assigned a weight that reflects its priority and optimization direction. Simulations using SUMO, TensorFlow, and Python include a cross-range transferability evaluation and show that the proposed variable cell length and multi-channel state representation method outperforms fixed cell length in optimization performance.
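
To make the state construction concrete, here is a rough Python sketch of a logarithmic-plus-linear partition and a three-channel state vector. It rests on several assumptions that this summary does not confirm: cell boundaries are taken at $f(i)$ for $i = 0, \dots, N$ and rescaled so the last boundary equals the sensor range, each channel is divided by its own historical maximum, and the channels are concatenated in the order counts, speeds, occupancy. All function and parameter names are hypothetical.

```python
import math
from typing import List, Sequence


def partition_boundaries(sensor_range: float, n_cells: int,
                         a: float, b: float) -> List[float]:
    """Boundaries (distance from the stop line) of n_cells variable-length cells.

    Uses f(x) = a*ln(x+1) + b*x. Rescaling so that the last boundary equals
    sensor_range is an assumption made for this sketch.
    """
    if n_cells < 1:
        raise ValueError("n_cells must be at least 1")

    def f(x: float) -> float:
        return a * math.log(x + 1) + b * x

    total = f(n_cells)
    if total <= 0:
        raise ValueError("a and b must give a positive f(n_cells)")
    return [sensor_range * f(i) / total for i in range(n_cells + 1)]


def normalize(values: Sequence[float], max_hist: float) -> List[float]:
    """Per-channel normalization x_i / x_maxhist."""
    return [v / max_hist for v in values]


def build_state(counts: Sequence[float], speeds: Sequence[float],
                occupancy: Sequence[float], count_max_hist: float,
                speed_max_hist: float, occupancy_max_hist: float) -> List[float]:
    """Concatenate the three normalized per-cell channels into one vector."""
    return (normalize(counts, count_max_hist)
            + normalize(speeds, speed_max_hist)
            + normalize(occupancy, occupancy_max_hist))
```

With this rescaling the number of cells, and therefore the length of the state vector, stays the same when the sensor range changes, which is the property the abstract relies on for cross-range transfer.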

Paper Structure (20 sections, 14 equations, 10 figures, 5 tables, 3 algorithms)

Figures (10)

  • Figure 1: The structure of the proposed methodology
  • Figure 2: Eight actions
  • Figure 3: Road network
  • Figure 4: Comparison of queue lengths across four traffic scenarios
  • Figure 5: The comparison of cumulative vehicle queue
  • ...and 5 more figures