MARVEL: Multi-Agent Reinforcement-Learning for Large-Scale Variable Speed Limits

Yuhang Zhang; Marcos Quinones-Grueiro; Zhiyao Zhang; Yanbing Wang; William Barbour; Gautam Biswas; Daniel Work

TL;DR

This work proposes MARVEL (Multi-Agent Reinforcement-learning for large-scale Variable spEed Limits), a framework for large-scale VSL control on highway corridors under real-world deployment settings. Its reward structure incorporates adaptability to traffic conditions, safety, and mobility, which enables multi-agent coordination.

Abstract

Variable Speed Limit (VSL) control is a promising highway traffic management strategy, deployed worldwide, that can enhance traffic safety by dynamically adjusting speed limits according to real-time traffic conditions. Most VSL control algorithms deployed so far are rule-based and lack generalizability under varying and complex traffic scenarios. In this work, we propose MARVEL (Multi-Agent Reinforcement-learning for large-scale Variable spEed Limits), a novel framework for large-scale VSL control on highway corridors with real-world deployment settings. MARVEL uses only sensing information observable in the real world as state input and learns through a reward structure that incorporates adaptability to traffic conditions, safety, and mobility, thereby enabling multi-agent coordination. With parameter sharing among all VSL agents, the proposed framework scales to cover corridors with many agents. The policies are trained in a microscopic traffic simulation environment on a short freeway stretch with 8 VSL agents spanning 7 miles. For testing, these policies are applied to a more extensive network with 34 VSL agents spanning 17 miles of I-24 near Nashville, TN, USA. The MARVEL-based method improves traffic safety by 63.4% compared to the no-control scenario and enhances traffic mobility by 58.6% compared to a state-of-the-practice algorithm that has been deployed on I-24. In addition, we conduct an explainability analysis to examine the decision-making process of the agents and explore the learned policy under different traffic conditions. Finally, we test the response of the policy learned from the simulation-based experiments with real-world data collected from I-24 and illustrate its deployment capability.
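
The abstract notes that parameter sharing lets policies trained with 8 agents be applied to a corridor with 34. The following is a minimal sketch of that idea only, not the authors' implementation: the linear policy, the `SPEED_LIMITS` action set, the three-feature observation, and all function names are hypothetical placeholders chosen for illustration.

```python
import random

# Hypothetical discrete action set (mph); the abstract does not specify it.
SPEED_LIMITS = [30, 40, 50, 60, 70]
# Assumed number of local features per agent (e.g. speed, occupancy).
OBS_DIM = 3


class SharedPolicy:
    """One set of weights reused by every VSL agent (parameter sharing)."""

    def __init__(self, obs_dim, n_actions, seed=0):
        rng = random.Random(seed)
        # weights[a][i]: contribution of feature i to the score of action a.
        # Random here; in practice these would be learned.
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]
                        for _ in range(n_actions)]

    def act(self, obs):
        """Return the index of the highest-scoring action for one agent."""
        scores = [sum(w * x for w, x in zip(row, obs)) for row in self.weights]
        return max(range(len(scores)), key=scores.__getitem__)


def control_step(policy, observations):
    """Apply the same policy to each agent's local observation."""
    return [SPEED_LIMITS[policy.act(obs)] for obs in observations]


def random_observations(n_agents, seed):
    """Generate placeholder observations; real inputs would come from sensors."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(OBS_DIM)] for _ in range(n_agents)]


policy = SharedPolicy(OBS_DIM, len(SPEED_LIMITS))
train_limits = control_step(policy, random_observations(8, seed=1))   # 8 agents
test_limits = control_step(policy, random_observations(34, seed=2))   # 34 agents
print(len(train_limits), len(test_limits))
```

Because the policy depends only on one agent's local observation, the number of agents never appears in its parameters, so the same weights can be evaluated on a corridor of any length.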

Paper Structure (33 sections, 10 equations, 14 figures, 3 tables, 1 algorithm)

Introduction
Literature Review
VSL Performance
Control Methods
Rule-based and Feedback Control
Optimization and Predictive Control
Data-Driven and Learning-Based Control
Preliminaries
RL and MARL
Policy Optimization Methods
Methodology
Temporal and Spatial Sequential Decision Making
Agent
State Space
Action Space
...and 18 more sections

Figures (14)

Figure 1: The three traffic zones for VSL design.
Figure 2: The VSL control framework based on the MAPPO algorithm.
Figure 3: The I-24 Smart Corridor segment with VSL control.
Figure 4: The learning curves of the total reward, adaptation reward term, mobility reward term, and safety reward term.
Figure 5: The time-space diagram of Scenario A. The left column shows the scenario under MARVEL-MAPPO control and the right column shows the scenario under speed-matching control. (Upper row: the average traffic speed from RDS units. Lower row: the control outputs from the algorithms. Note: the figure is trimmed in time to show the congestion better.)
...and 9 more figures
