LATS: Large Language Model Assisted Teacher-Student Framework for Multi-Agent Reinforcement Learning in Traffic Signal Control

Yifeng Zhang, Peizhuo Li, Tingguang Zhou, Mingfeng Fan, Guillaume Sartoretti

Abstract

Adaptive Traffic Signal Control (ATSC) aims to optimize traffic flow and minimize delays by adjusting traffic lights in real time. Recent advances in Multi-agent Reinforcement Learning (MARL) have shown promise for ATSC, yet existing approaches still suffer from limited representational capacity, often leading to suboptimal performance and poor generalization in complex and dynamic traffic environments. On the other hand, Large Language Models (LLMs) excel at semantic representation, reasoning, and analysis, yet their propensity for hallucination and slow inference speeds often hinder their direct application to decision-making tasks. To address these challenges, we propose a novel learning paradigm named LATS that integrates LLMs and MARL, leveraging the former's strong prior knowledge and inductive abilities to enhance the latter's decision-making process. Specifically, we introduce a plug-and-play teacher-student learning module, where a trained embedding LLM serves as a teacher to generate rich semantic features that capture each intersection's topology structures and traffic dynamics. A much simpler (student) neural network then learns to emulate these features through knowledge distillation in the latent space, enabling the final model to operate independently from the LLM for downstream use in the RL decision-making process. This integration significantly enhances the overall model's representational capacity across diverse traffic scenarios, thus leading to more efficient and generalizable control strategies. Extensive experiments across diverse traffic datasets empirically demonstrate that our method enhances the representation learning capability of RL models, thereby leading to improved overall performance and generalization over both traditional RL and LLM-only approaches. [...]

Paper Structure

This paper contains 26 sections, 22 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: The pipeline of our proposed LLM-assisted TSC framework, where we employ LLMs to generate rich semantic features based on customized traffic prompts to enhance the downstream MARL decision-making process.
  • Figure 2: (a) An illustration of an intersection, which consists of roads, lanes, and traffic movements. (b) Eight non-conflicting traffic signal phases at the intersection. (c) Depiction of the traffic signal control process based on phase selections. (d) Illustration of the state of a traffic movement for RL agents.
  • Figure 3: Architecture of the proposed LLM-assisted teacher-student learning framework (LATS) for general TSC. The teacher uses a pre-trained embedding LLM to generate semantic embedding vectors from customized traffic prompts (top left) for each phase. The student model emulates these vectors through knowledge distillation in the latent space. These latent vectors are integrated into the downstream RL decision-making process to enhance the phase representations.
  • Figure 4: A snapshot of a four-road intersection from the Grid 5$\times$5 network during the simulation for generating semantic features in LATS (left). The traffic prompt for phase $p0$ at the intersection, which contains the descriptions of static intersection topology and dynamic traffic conditions (right). The black text represents the unified prompt template, while the orange text indicates the filled-in intersection topology and traffic dynamics information.
  • Figure 5: Illustration of the three selected intersections (agent 2, agent 9, and agent 23) with diverse topology structures from the heterogeneous Monaco traffic network used for t-SNE visualizations (left). 2D t-SNE visualization of phase features for three selected intersections, comparing LATS and IPPO (LATS w/o TS) during training: (a) LATS after 800 episodes, (b) LATS after 1600 episodes, (c) IPPO after 800 episodes, and (d) IPPO after 1600 episodes (right).
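
The teacher-student idea described in the abstract and in the Figure 3 caption can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The frozen `teacher_embed` function is a hypothetical stand-in for the embedding LLM, the three-value input vector stands in for a phase's numeric traffic state, the student is a single linear layer, and the mean-squared-error distillation loss is an assumption, since the text above does not specify the loss or the student architecture.

```python
import math
import random

rng = random.Random(0)
IN_DIM, LATENT_DIM = 3, 4  # hypothetical sizes, chosen only for illustration

# Hypothetical frozen "teacher": a fixed nonlinear map standing in for the
# embedding LLM, which in the paper consumes a text prompt per phase.
TEACHER_A = [[rng.uniform(-1, 1) for _ in range(IN_DIM)]
             for _ in range(LATENT_DIM)]


def teacher_embed(x):
    """Return the teacher's latent vector for one phase state x."""
    return [math.tanh(sum(a * xi for a, xi in zip(row, x)))
            for row in TEACHER_A]


def student_embed(W, x):
    """Student: a single linear layer mapping the state to the latent space."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def distill_loss(W, xs, targets):
    """Mean squared error between student and teacher latent vectors."""
    total = 0.0
    for x, t in zip(xs, targets):
        s = student_embed(W, x)
        total += sum((si - ti) ** 2 for si, ti in zip(s, t))
    return total / (len(xs) * LATENT_DIM)


def distill_step(W, xs, targets, lr=0.5):
    """One full-batch gradient-descent step on the distillation loss."""
    scale = 2.0 / (len(xs) * LATENT_DIM)
    grad = [[0.0] * IN_DIM for _ in range(LATENT_DIM)]
    for x, t in zip(xs, targets):
        s = student_embed(W, x)
        for i in range(LATENT_DIM):
            err = s[i] - t[i]
            for j in range(IN_DIM):
                grad[i][j] += scale * err * x[j]
    return [[W[i][j] - lr * grad[i][j] for j in range(IN_DIM)]
            for i in range(LATENT_DIM)]


# Synthetic phase states in [0, 1); the teacher is queried once, offline.
xs = [[rng.random() for _ in range(IN_DIM)] for _ in range(8)]
targets = [teacher_embed(x) for x in xs]

W = [[0.0] * IN_DIM for _ in range(LATENT_DIM)]
losses = [distill_loss(W, xs, targets)]
for _ in range(200):
    W = distill_step(W, xs, targets)
    losses.append(distill_loss(W, xs, targets))

# After training, only the student is needed to produce latent features.
latent = student_embed(W, xs[0])
```

Once distillation is done, `student_embed` runs without calling the teacher. This is the property the abstract refers to when it says the final model operates independently from the LLM. In this sketch the resulting `latent` vector is what would be passed on to the RL policy as an enriched phase representation.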