MTLight: Efficient Multi-Task Reinforcement Learning for Traffic Signal Control
Liwen Zhu, Peixi Peng, Zongqing Lu, Yonghong Tian
TL;DR
MTLight addresses limited observation richness and poor sample efficiency in multi-agent traffic signal control by learning a hierarchical latent state with a multi-task network. The latent state comprises a task-shared representation and task-specific representations derived from multiple auxiliary traffic prediction tasks, and the policy is conditioned on it for improved decision making. Empirical results on CityFlow across several cities and traffic patterns show faster convergence, better asymptotic performance, and robust adaptability to peak-hour flows, outperforming both conventional methods and existing RL baselines. The work highlights the value of using related auxiliary tasks as priors to guide learning in complex, interconnected traffic networks, with potential extensions through imitation learning and pretraining of the latent module.
Abstract
Traffic signal control plays an important role in alleviating traffic congestion in modern cities. Deep reinforcement learning (RL) has been widely applied to this task in recent years, showing promising results but also facing challenges such as limited performance and sample inefficiency. To address these challenges, we propose MTLight, which augments the agent's observation with a latent state learned from numerous traffic indicators. Multiple auxiliary and supervisory tasks are constructed to learn the latent state, and two types of latent embedding features, task-specific and task-shared, are used to enrich it. Extensive experiments on CityFlow show that MTLight leads in both convergence speed and asymptotic performance. We further simulate a peak-hour pattern in all scenarios with increasing control difficulty, and the results indicate that MTLight is highly adaptable.
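
As a rough illustration of the idea in the abstract, the sketch below shows one way a latent state could be built from a task-shared feature and several task-specific features, then concatenated with the raw observation before the policy head. It is a minimal sketch under stated assumptions, not the authors' architecture: the class name `LatentPolicySketch`, the single-layer encoders, all dimensions, and the mean-squared-error auxiliary loss are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(in_dim, out_dim):
    """Random weights and zero bias for one linear layer (illustrative only)."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def relu(x):
    return np.maximum(x, 0.0)


class LatentPolicySketch:
    """Hypothetical multi-task latent module feeding a policy head."""

    def __init__(self, obs_dim, indicator_dim, shared_dim, specific_dim,
                 task_out_dims, n_phases):
        # One encoder shared by all auxiliary tasks.
        self.shared = dense(indicator_dim, shared_dim)
        # One task-specific layer and one prediction head per auxiliary task.
        self.specific = [dense(shared_dim, specific_dim) for _ in task_out_dims]
        self.heads = [dense(specific_dim, d) for d in task_out_dims]
        latent_dim = shared_dim + specific_dim * len(task_out_dims)
        # The policy sees the raw observation concatenated with the latent state.
        self.policy = dense(obs_dim + latent_dim, n_phases)

    def latent(self, indicators):
        """Return the latent state and the auxiliary predictions."""
        w_s, b_s = self.shared
        h_shared = relu(indicators @ w_s + b_s)
        h_specific = [relu(h_shared @ w + b) for w, b in self.specific]
        preds = [h @ w + b for h, (w, b) in zip(h_specific, self.heads)]
        z = np.concatenate([h_shared] + h_specific, axis=-1)
        return z, preds

    def act_logits(self, obs, indicators):
        """Signal-phase logits conditioned on observation and latent state."""
        z, _ = self.latent(indicators)
        w_p, b_p = self.policy
        return np.concatenate([obs, z], axis=-1) @ w_p + b_p

    def aux_loss(self, indicators, targets):
        """Sum of per-task mean squared errors (an assumed supervised loss)."""
        _, preds = self.latent(indicators)
        return sum(float(np.mean((p - t) ** 2)) for p, t in zip(preds, targets))
```

In this sketch the auxiliary loss would be minimized alongside the RL objective so that the latent state carries information about the predicted traffic quantities. How the two objectives are weighted or scheduled is not specified in the abstract and is left open here.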
