Collaborative Control Method of Transit Signal Priority Based on Cooperative Game and Reinforcement Learning

Hao Qin; Weishi Zhang

TL;DR

The paper tackles inefficient transit signal priority (TSP) control by proposing CBQL-TSP, an eight-phase TSP framework that combines cooperative game theory with reinforcement learning. Signal decisions are modeled as a multi-agent Markov decision process (MDP), and Shapley values allocate each participant's marginal contribution; these values inform state transitions and action selection so that bus reliability is balanced against private-vehicle efficiency. Using CTM-based traffic modeling and PARAMICS simulations on a five-intersection network with bus lanes, the authors report improved stability and reduced bus transit times (roughly a 24.6% system-wide transit-time reduction and a 37.4% bus-time reduction in the city center) compared with MB-TSP, MP-TSP, ASC-TSP, and no-TSP baselines. The method is presented as fair, adaptive, and scalable, harmonizing public transit performance with general traffic flow; future directions include multi-level priority schemes and data-driven fine-grained control.

Abstract

To address the low efficiency of priority signal control in intelligent transportation systems, this study introduces CBQL-TSP, an eight-phase priority signal control method built on a hybrid decision-making framework that integrates cooperative game theory and reinforcement learning. The approach treats the allocation of bus signal priority as a multi-objective decision-making problem over an eight-phase signal sequence and uses a cooperative game model to differentiate priority phases from non-priority phases. The hybrid decision-making algorithm, CBQL, addresses the multi-objective decisions in this sequence: it computes the Shapley value function to quantify the marginal contribution of each participant, and these values inform a state transition probability equation based on Shapley value ratios. Compared with conventional control methods, CBQL-TSP upholds the fairness principles of cooperative game theory while drawing on the adaptive learning capability of Q-Learning. This allows signal timing to be adjusted dynamically in response to real-time traffic conditions, improving the flexibility and efficiency of priority signal control.
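
As a rough illustration of the Shapley-value step described in the abstract, the sketch below computes exact Shapley values for a small cooperative game and normalizes them into ratios. It is not the authors' CBQL implementation: the three players "A", "B", "C", the coalition worths in TOY_WORTH, and the function names shapley_values and shapley_ratios are all hypothetical, and the abstract does not specify how the ratios enter the state transition probability equation.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over every possible ordering in which players join the coalition."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p to the coalition formed so far.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


def shapley_ratios(phi):
    """Normalize Shapley values so that they sum to 1."""
    total = sum(phi.values())
    if total == 0:
        raise ValueError("Shapley values sum to zero; ratios are undefined")
    return {p: v / total for p, v in phi.items()}


# Hypothetical toy game (assumption, not data from the paper):
# each coalition's worth is a made-up benefit figure.
TOY_WORTH = {
    frozenset(): 0.0,
    frozenset({"A"}): 10.0,
    frozenset({"B"}): 20.0,
    frozenset({"C"}): 30.0,
    frozenset({"A", "B"}): 40.0,
    frozenset({"A", "C"}): 50.0,
    frozenset({"B", "C"}): 60.0,
    frozenset({"A", "B", "C"}): 90.0,
}

phi = shapley_values(["A", "B", "C"], lambda s: TOY_WORTH[frozenset(s)])
ratios = shapley_ratios(phi)
print(phi)
print(ratios)
```

In this toy game the Shapley values are 20, 30, and 40 for A, B, and C, which sum to the grand-coalition worth of 90. Enumerating all orderings costs factorial time, so this exact form is only practical for a small number of participants.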
