
Spatio-temporal dual-stage hypergraph MARL for human-centric multimodal corridor traffic signal control

Xiaocai Zhang, Neema Nassir, Milad Haghani

TL;DR

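STDSH-MARL (Spatio-Temporal Dual-Stage Hypergraph based Multi-Agent Reinforcement Learning) is a scalable multi-agent deep reinforcement learning framework that follows a centralized training and decentralized execution paradigm. On a corridor network under five traffic scenarios, it achieves superior overall performance compared with state-of-the-art baselines and provides clear benefits for public transportation priority.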

Abstract

Human-centric traffic signal control in corridor networks must increasingly account for multimodal travelers, particularly high-occupancy public transportation, rather than focusing solely on vehicle-centric performance. This paper proposes STDSH-MARL (Spatio-Temporal Dual-Stage Hypergraph based Multi-Agent Reinforcement Learning), a scalable multi-agent deep reinforcement learning framework that follows a centralized training and decentralized execution paradigm. The proposed method captures spatio-temporal dependencies through a novel dual-stage hypergraph attention mechanism that models interactions across both spatial and temporal hyperedges. In addition, a hybrid discrete action space is introduced to jointly determine the next signal phase configuration and its corresponding green duration, enabling more adaptive signal timing decisions. Experiments conducted on a corridor network under five traffic scenarios demonstrate that STDSH-MARL consistently improves multimodal performance and provides clear benefits for public transportation priority. Compared with state-of-the-art baseline methods, the proposed approach achieves superior overall performance. Further ablation studies confirm the contribution of each component of STDSH-MARL, with temporal hyperedges identified as the most influential factor driving the observed performance gains.
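
The hybrid discrete action space described in the abstract can be illustrated with a minimal sketch, in which each discrete action index maps to a (next phase, green duration) pair. The phase names, the number of phases, and the candidate green durations below are hypothetical placeholders chosen for illustration; the abstract does not state the values used by STDSH-MARL.

```python
from itertools import product

# Hypothetical placeholders: the abstract does not give the actual
# phase set or the candidate green durations.
PHASES = ["phase_0", "phase_1", "phase_2", "phase_3"]
GREEN_DURATIONS_S = [10, 15, 20, 25]  # seconds

# Flatten the joint choice (phase, green duration) into one discrete
# action set, so a single policy output selects both at once.
ACTIONS = list(product(range(len(PHASES)), GREEN_DURATIONS_S))


def decode_action(index):
    """Map a flat action index to (phase name, green duration in seconds)."""
    phase_id, green_s = ACTIONS[index]
    return PHASES[phase_id], green_s


def encode_action(phase_id, green_s):
    """Map (phase id, green duration in seconds) back to the flat index."""
    return phase_id * len(GREEN_DURATIONS_S) + GREEN_DURATIONS_S.index(green_s)
```

Under these assumptions, an agent with 4 phases and 4 candidate durations has 16 joint actions, and `decode_action(5)` selects the second phase with a 15-second green.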

Paper Structure

This paper contains 34 sections, 25 equations, 10 figures, 13 tables.

Figures (10)

  • Figure 1: Overview of the proposed STDSH-MARL framework for multimodal corridor traffic signal control.
  • Figure 2: Overview of STDSH-MARL: DSHA learns a hypergraph-level embedding from spatial and temporal hyperedges for a centralized critic, while decentralized agents act from local state representations in a VISSIM-simulated corridor.
  • Figure 3: Coverage of the approach lanes for the example intersection B in the corridor network. Green denotes vehicle lanes and pink denotes tram lanes. The layout assumes left-hand traffic.
  • Figure 4: Phase configurations for action selection.
  • Figure 5: Partial view of the constructed corridor network in VISSIM for experimental evaluation.
  • ...and 5 more figures