iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement

Aoyu Pang, Maonan Wang, Man-On Pun, Chung Shue Chen, Xi Xiong

TL;DR

This paper tackles robust traffic signal control (TSC) under imperfect observations and long-tail events. It proposes iLLM-TSC, a hybrid framework in which a reinforcement learning (RL) agent generates the initial TSC decision and a large language model (LLM) reviews and, where necessary, adjusts that decision using broader context such as emergency vehicles or degraded communication. The framework is designed to plug into existing RL-based TSC systems without changes to their architectures. Experiments on SUMO show that iLLM-TSC reduces average waiting time by 17.5% under degraded communication compared with traditional RL methods, indicating improved robustness and practicality for intelligent transportation systems.

Abstract

Urban congestion remains a critical challenge, with traffic signal control (TSC) emerging as a potent solution. TSC is often modeled as a Markov Decision Process and then solved using reinforcement learning (RL), which has proven effective. However, existing RL-based TSC systems often overlook imperfect observations caused by degraded communication, such as packet loss, delays, and noise, as well as rare real-life events not included in the reward function, such as unconsidered emergency vehicles. To address these limitations, we introduce a novel integration framework that combines a large language model (LLM) with RL. This framework is designed to manage overlooked elements in the reward function and gaps in state information, thereby enhancing the policies of RL agents. In our approach, RL first makes decisions based on observed data. LLMs then evaluate these decisions to verify their reasonableness. If a decision is found to be unreasonable, it is adjusted accordingly. Additionally, this approach can be integrated seamlessly with existing RL-based TSC systems without requiring modifications to them. Extensive testing confirms that our approach reduces the average waiting time by 17.5% in degraded communication conditions compared with traditional RL methods, underscoring its potential to advance practical RL applications in intelligent transportation systems. The related code can be found at https://github.com/Traffic-Alpha/iLLM-TSC.
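
The propose-then-review loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Observation`, `Review`, `rl_policy`, `stub_reviewer`, and `review_and_adjust` are hypothetical, the RL agent is replaced by a greedy longest-queue rule, and the LLM is replaced by a rule-based stand-in so that the example runs without any model or simulator. Only the control flow (RL proposes, a reviewer checks, an unreasonable action is replaced) follows the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Observation:
    # Queue length per signal phase; None models a reading lost to degraded communication.
    queue_lengths: Dict[int, Optional[float]]
    # Phase that would serve a waiting emergency vehicle, if any (assumed to be known
    # to the reviewer but not to the RL agent).
    emergency_phase: Optional[int] = None


@dataclass
class Review:
    reasonable: bool  # True if the proposed action is accepted as-is
    action: int       # replacement action, used only when reasonable is False
    reason: str       # short explanation, as an LLM reviewer might return


def rl_policy(obs: Observation) -> int:
    """Stand-in for a trained RL agent: serve the longest *observed* queue."""
    known = {p: q for p, q in obs.queue_lengths.items() if q is not None}
    if not known:
        return 0  # arbitrary default when every reading is missing
    return max(known, key=known.get)


def stub_reviewer(obs: Observation, proposed: int) -> Review:
    """Rule-based stand-in for the LLM review step (hypothetical logic)."""
    if obs.emergency_phase is not None and proposed != obs.emergency_phase:
        return Review(False, obs.emergency_phase, "emergency vehicle is waiting")
    return Review(True, proposed, "no conflict found")


def review_and_adjust(
    obs: Observation,
    policy: Callable[[Observation], int],
    reviewer: Callable[[Observation, int], Review],
) -> int:
    """RL proposes an action; the reviewer either accepts it or supplies a replacement."""
    proposed = policy(obs)
    review = reviewer(obs, proposed)
    return proposed if review.reasonable else review.action


# Phase 2's reading is missing; the RL stand-in picks phase 0 (longest observed queue).
normal = Observation({0: 5.0, 1: 2.0, 2: None, 3: 1.0})
# Same readings, but an emergency vehicle needs phase 3.
emergency = Observation({0: 5.0, 1: 2.0, 2: None, 3: 1.0}, emergency_phase=3)

print(review_and_adjust(normal, rl_policy, stub_reviewer))
print(review_and_adjust(emergency, rl_policy, stub_reviewer))
```

Because the policy and the reviewer are passed in as plain callables, the RL agent itself is left unchanged, which mirrors the claim that the framework can wrap an existing RL-based TSC system without modifying it.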

Paper Structure (24 sections, 8 equations, 9 figures, 2 tables, 1 algorithm)

This paper contains 24 sections, 8 equations, 9 figures, 2 tables, 1 algorithm.

Figures (9)

  • Figure 1: Challenges in real-world TSC systems: Degraded communication and long-tail scenarios impacting RL decision-making.
  • Figure 2: Illustration of the four phases of traffic signals in a standard 4-way intersection considered in this work.
  • Figure 3: The detailed structure of the proposed iLLM-TSC system.
  • Figure 4: Schematic diagram of the LLM as a traffic assistant, illustrating the three-stage architecture of the LLM in traffic control: formulating thought programs, acquiring environmental information, and optimizing RL actions.
  • Figure 5: Comparison of the relative performance of different LLMs using prompts of different levels. (a) Normal vehicles. (b) Emergency vehicles.
  • ...and 4 more figures