Adaptive traffic signal control optimization using a novel road partition and multi-channel state representation method
Maojiang Deng, Shoufeng Lu, Jiazhao Shi, Wen Zhang
TL;DR
This work tackles urban intersection congestion by introducing an adaptive traffic signal control framework that combines a novel road-partitioning scheme with a multi-channel traffic state representation. The road is partitioned using the logarithmic–linear formula $f(x) = a \ln(x+1) + b x$, which produces a fixed number of cells regardless of sensor range and thereby enables transfer across intersections. The state input comprises three channels (vehicle counts, speeds, occupancy), each normalized by its historical maximum as $x_{i,\mathrm{norm}} = x_i / x_{\mathrm{max,hist}}$. Two DRL algorithms, DQN and PPO, are trained in SUMO-TraCI simulations. The reward function blends waiting time, speed, and fuel consumption as $r(t) = -0.7 \cdot \mathrm{waiting}_{\mathrm{penalty}} + 0.2 \cdot \mathrm{speed}_{\mathrm{reward}} - 0.1 \cdot \mathrm{fuel}_{\mathrm{penalty}}$, where each term is normalized by a typical maximum value, for example $\mathrm{waiting}_{\mathrm{penalty}} = \min(\mathrm{cur}_{\mathrm{waiting}}/200, 1)$. Results show that the variable-cell-length PPO configuration (VCL-PPO) achieves the lowest cumulative queue length and waiting time and transfers across sensor ranges, which indicates good generalization and potential for practical deployment with radar–video sensing data.
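The reward above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights and the waiting-time cap of 200 come from the summary, while the function names and the `max_speed` and `max_fuel` normalizers are hypothetical placeholders, since the summary does not give the speed and fuel definitions.

```python
def clipped_ratio(value, typical_max):
    """Normalize a non-negative metric by a typical maximum and cap it at 1."""
    return min(value / typical_max, 1.0)


def reward(cur_waiting, cur_speed, cur_fuel,
           max_waiting=200.0,   # cap given in the summary
           max_speed=13.89,     # assumption: 50 km/h expressed in m/s
           max_fuel=1.0):       # assumption: placeholder, units not specified
    """r(t) = -0.7 * waiting_penalty + 0.2 * speed_reward - 0.1 * fuel_penalty."""
    waiting_penalty = clipped_ratio(cur_waiting, max_waiting)
    speed_reward = clipped_ratio(cur_speed, max_speed)   # assumed same form
    fuel_penalty = clipped_ratio(cur_fuel, max_fuel)     # assumed same form
    return -0.7 * waiting_penalty + 0.2 * speed_reward - 0.1 * fuel_penalty
```

With these definitions the reward lies between -0.8 (long waits, standstill, high fuel use) and 0.2 (no waiting, free-flow speed, no fuel use), so waiting time dominates the signal the agent receives.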
Abstract
This study proposes a novel adaptive traffic signal control method that uses a Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) to optimize signal timing by integrating variable cell length and multi-channel state representation. A road partition formula consisting of the sum of a logarithmic function and a linear function is proposed. The state is a vector composed of three channels: the number of vehicles, the average speed, and the space occupancy. The set of available signal phases constitutes the action space, and the selected phase is executed with a fixed green time. The reward function is formulated using the absolute values of key traffic state metrics: waiting time, speed, and fuel consumption. Each metric is normalized by a typical maximum value and assigned a weight that reflects its priority and optimization direction. Simulation results obtained with SUMO-TensorFlow-Python include a cross-range transferability evaluation and show that the proposed variable cell length and multi-channel state representation method outperforms a fixed cell length.
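The partition and the three-channel state can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes that cell boundary $i$ sits at $f(i) = a \ln(i+1) + b i$, rescaled so that the last boundary equals the sensor range, and that space occupancy is the share of a cell's length covered by vehicles of a nominal length. The names `cell_boundaries`, `build_state`, `hist_max`, and `vehicle_length` are hypothetical.

```python
import math
from bisect import bisect_right


def cell_boundaries(n_cells, sensor_range, a=1.0, b=1.0):
    """Boundaries in metres from the stop line for a fixed number of cells.

    Assumption: boundary i is f(i) = a*ln(i+1) + b*i, rescaled so that the
    last boundary equals sensor_range (a and b are taken to be positive).
    """
    raw = [a * math.log(i + 1) + b * i for i in range(n_cells + 1)]
    scale = sensor_range / raw[-1]
    return [r * scale for r in raw]


def build_state(vehicles, boundaries, hist_max, vehicle_length=5.0):
    """Return three channels (count, mean speed, occupancy), one value per cell.

    vehicles: iterable of (distance_to_stop_line_m, speed_mps).
    hist_max: historical maximum per channel, used as x_i / x_max_hist.
    """
    n = len(boundaries) - 1
    counts = [0] * n
    speed_sums = [0.0] * n
    for dist, speed in vehicles:
        if 0.0 <= dist < boundaries[-1]:      # ignore vehicles out of range
            idx = bisect_right(boundaries, dist) - 1
            counts[idx] += 1
            speed_sums[idx] += speed
    mean_speeds = [s / c if c else 0.0 for s, c in zip(speed_sums, counts)]
    # Assumed definition: occupied length divided by cell length.
    occupancy = [c * vehicle_length / (boundaries[i + 1] - boundaries[i])
                 for i, c in enumerate(counts)]
    channels = {"count": counts, "speed": mean_speeds, "occupancy": occupancy}
    return [[v / hist_max[name] for v in channels[name]]
            for name in ("count", "speed", "occupancy")]
```

Because the number of cells is fixed and only the boundaries scale with `sensor_range`, the state vector keeps the same shape for different sensor ranges, which is the property the cross-range transfer evaluation relies on.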
