MARLens: Understanding Multi-agent Reinforcement Learning for Traffic Signal Control via Visual Analytics

Yutian Zhang; Guohong Zheng; Zhiyuan Liu; Quan Li; Haipeng Zeng

TL;DR

This study examines the interpretability of multi-agent reinforcement learning (MARL) in the context of traffic signal control (TSC) and proposes MARLens, a visual analytics system for understanding MARL-based TSC models.

Abstract

The issue of traffic congestion poses a significant obstacle to the development of global cities. One promising solution to tackle this problem is intelligent traffic signal control (TSC). Recently, TSC strategies leveraging reinforcement learning (RL) have garnered attention among researchers. However, the evaluation of these models has primarily relied on fixed metrics like reward and queue length. This limited evaluation approach provides only a narrow view of the model's decision-making process, impeding its practical implementation. Moreover, effective TSC necessitates coordinated actions across multiple intersections. Existing visual analysis solutions fall short when applied in multi-agent settings. In this study, we delve into the challenge of interpretability in multi-agent reinforcement learning (MARL), particularly within the context of TSC. We propose MARLens, a visual analytics system tailored to understand MARL-based TSC. Our system serves as a versatile platform for both RL and TSC researchers. It empowers them to explore the model's features from various perspectives, revealing its decision-making processes and shedding light on interactions among different agents. To facilitate quick identification of critical states, we have devised multiple visualization views, complemented by a traffic simulation module that allows users to replay specific training scenarios. To validate the utility of our proposed system, we present three comprehensive case studies, incorporate insights from domain experts through interviews, and conduct a user study. These collective efforts underscore the feasibility and effectiveness of MARLens in enhancing our understanding of MARL-based TSC systems and pave the way for more informed and efficient traffic management strategies.

Paper Structure (32 sections, 1 equation, 8 figures, 1 table)

Introduction
Introduction
Related Work
Reinforcement Learning for Traffic Signal Control
Interpretability of Reinforcement Learning
Preliminaries
RL-based TSC Scenario
Multi-Agent Deep Deterministic Policy Gradient
Observational Study
Experts' Current Practices and Bottlenecks
Experts' Needs and Expectations
Overview of MARLens
Back-end Engine of MARLens
Scene Initialization
Data Description
...and 17 more sections

Figures (8)

Figure 1: The system pipeline of MARLens. In the back-end engine, we extract critical information about the agents’ behavior, relationships, and decision-making processes from three distinct types of data collected. In the front-end visualization, we offer five coordinated views with rich interactions to facilitate exploration and comprehension of the MARL model.
Figure 2: Our visualization system, MARLens, provides an in-depth analysis of MARL models in TSC scenarios. The Control Panel (A) presents the parameters for model training and testing. The Training Distribution (B) shows the distribution of the metrics and ranks the episodes by those metrics. The Episode Overview (C) presents a summary of traffic conditions and each agent's policy in a given episode. The Episode Detail (D) provides a visual summary for each agent in an episode, including information about the state, action, and selected metrics, and shows the relationships among multiple agents. The Policy Explainer (E) explains the relationships between local state and action, and between global information and critic value. The Simulation Replay (F) supports replaying an arbitrary episode or time step in the simulation. The Snapshot Log (G) saves snapshots of the Policy Explainer.
Figure 3: Glyph design and interaction in the Episode Detail. (a) When hovering the mouse over the bar charts (feature importance), the corresponding feature's name is displayed in the center. (b) Utilization of a chord diagram to illustrate the relationships among agents. (c) The glyph design showcasing state information, employing distinct rings to represent various aspects of the episode, such as time step, traffic flow, action, and reward.
Figure 4: Various design alternatives were evaluated for the components within the Episode Detail. (a) Represents a commonly employed design in TSC for displaying traffic signal phases. (b) Introduces an alternative design aimed at conserving space and facilitating the comparison of agents' actions. (c) Depicts the current design, employing icons to directly illustrate different traffic signal phases. Line charts are utilized to concurrently compare metrics for different agents.
Figure 5: Design alternatives were considered for the Policy Explainer component. (a) One design approach involved presenting feature names and their value ranges directly within each branch, supplemented by accompanying text. (b) Another design utilized distribution representations but did not differentiate between branches for different agents or vary branch thickness to indicate the number of rules. (c) The current design employs distribution representations with distinct colors for each agent's branches and varying branch thickness to visually represent the number of rules.
...and 3 more figures
