DiffLight: A Partial Rewards Conditioned Diffusion Model for Traffic Signal Control with Missing Data

Hanyang Chen, Yang Jiang, Shengnan Guo, Xiaowei Mao, Youfang Lin, Huaiyu Wan

TL;DR

This paper introduces DiffLight, a conditional diffusion model for offline traffic signal control (TSC) with missing data, and proposes a Diffusion Communication Mechanism (DCM) to improve communication and control performance in data-missing scenarios.

Abstract

The application of reinforcement learning in traffic signal control (TSC) has been extensively researched and yielded notable achievements. However, most existing works for TSC assume that traffic data from all surrounding intersections is fully and continuously available through sensors. In real-world applications, this assumption often fails due to sensor malfunctions or data loss, making TSC with missing data a critical challenge. To meet the needs of practical applications, we introduce DiffLight, a novel conditional diffusion model for TSC under data-missing scenarios in the offline setting. Specifically, we integrate two essential sub-tasks, i.e., traffic data imputation and decision-making, by leveraging a Partial Rewards Conditioned Diffusion (PRCD) model to prevent missing rewards from interfering with the learning process. Meanwhile, to effectively capture the spatial-temporal dependencies among intersections, we design a Spatial-Temporal transFormer (STFormer) architecture. In addition, we propose a Diffusion Communication Mechanism (DCM) to promote better communication and control performance under data-missing scenarios. Extensive experiments on five datasets with various data-missing scenarios demonstrate that DiffLight is an effective controller to address TSC with missing data. The code of DiffLight is released at https://github.com/lokol5579/DiffLight-release.

Paper Structure

This paper contains 58 sections, 21 equations, 4 figures, 15 tables.

Figures (4)

  • Figure 1: Illustration of a four-way intersection with 12 traffic movements and 4 traffic signal phases.
  • Figure 2: An overview of DiffLight, showing the signal control process of one intersection under the random missing pattern. Traffic data is collected by sensors to derive rewards and observations. Missing rewards and observations are masked. Only the observed part of the local intersection's observation trajectory, its observable rewards, and the observation trajectories from neighboring intersections are fed into PRCD with STFormer. During inference, DCM works with STFormer to generate observations. The inverse dynamics model then generates actions to control the traffic signal.
  • Figure 3: The relative generalization performance of DiffLight at different missing rates. The x-axis is the missing rate during testing and the y-axis is the missing rate during training. The formula used to calculate the relative generalization performance is given below.
  • Figure 4: Illustration of the random missing and kriging missing patterns. Each node represents an intersection in the road network, and masked blocks denote traffic data with missing values.
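
The two missing patterns named in Figure 4 can be illustrated with boolean masks. The following is a minimal sketch, not the authors' implementation. It assumes that random missing drops individual (time step, intersection) entries independently at a given rate, and that kriging missing drops every entry of a chosen subset of intersections, as if they had no sensors. The function names, array shapes, and parameters are hypothetical.

```python
import numpy as np


def random_missing_mask(num_steps, num_nodes, rate, rng):
    """Assumed random missing: each entry is dropped independently.

    Returns a boolean array of shape (num_steps, num_nodes) where
    True marks an observed entry and False a missing one.
    """
    return rng.random((num_steps, num_nodes)) >= rate


def kriging_missing_mask(num_steps, num_nodes, num_unobserved, rng):
    """Assumed kriging missing: whole intersections are unobserved.

    A random subset of intersections (columns) is missing at every
    time step; all other intersections are fully observed.
    """
    mask = np.ones((num_steps, num_nodes), dtype=bool)
    unobserved = rng.choice(num_nodes, size=num_unobserved, replace=False)
    mask[:, unobserved] = False
    return mask


rng = np.random.default_rng(0)
rand_mask = random_missing_mask(num_steps=100, num_nodes=16, rate=0.3, rng=rng)
krig_mask = kriging_missing_mask(num_steps=100, num_nodes=16, num_unobserved=4, rng=rng)
```

Under these assumptions, masked entries would be hidden from the model during both training and evaluation, which matches the caption's description of masked blocks as missing traffic data.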