Multi-Agent Strategy Explanations for Human-Robot Collaboration

Ravi Pandya; Michelle Zhao; Changliu Liu; Reid Simmons; Henny Admoni

TL;DR

This work addresses the challenge of coordinating human-robot teams in settings with multiple Nash equilibria by introducing strategy-conditioned landmarks as a visual and textual explanation mechanism. It formalizes the problem as a HiP-MDP with a latent strategy space $\mathcal{Z}$, then clusters strategies, derives landmark states $S_{land}$, and generates explanations via an LLM to enable proactive strategy alignment. The method is evaluated in two domains, Collaborative Maze and Social Navigation, using Co-GAIL and ILQGames to obtain diverse strategy clusters; user studies show that landmark-based explanations improve zero-shot coordination and strategy exploration, with video-only baselines being less effective. The approach demonstrates a practical path toward proactive, explainable multi-agent collaboration, with future work focusing on scalability, home and real-world deployment, and safety considerations in physical robotics.
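As a rough illustration of the clustering step mentioned above, the sketch below groups latent strategy vectors and picks one representative demonstration per cluster. It assumes plain k-means over synthetic 2-D latents; the summary does not state which clustering algorithm the authors use, and the function names `cluster_strategies` and `representatives` are hypothetical.

```python
import numpy as np


def cluster_strategies(latents, k, iters=50, seed=0):
    """Plain k-means over latent strategy vectors z (an assumed choice of algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers from k distinct demonstrations.
    centers = latents[rng.choice(len(latents), size=k, replace=False)]
    labels = np.zeros(len(latents), dtype=int)
    for _ in range(iters):
        # Distance from every latent vector to every center, shape (n, k).
        dists = np.linalg.norm(latents[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            latents[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def representatives(latents, labels, centers):
    """Index of the demonstration closest to each cluster center (None if empty)."""
    reps = []
    for j, center in enumerate(centers):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            reps.append(None)
            continue
        nearest = np.linalg.norm(latents[idx] - center, axis=1).argmin()
        reps.append(int(idx[nearest]))
    return reps


# Synthetic latents standing in for z in the strategy space: two separated blobs.
data_rng = np.random.default_rng(1)
Z = np.vstack([
    data_rng.normal(loc=0.0, scale=0.3, size=(20, 2)),
    data_rng.normal(loc=5.0, scale=0.3, size=(20, 2)),
])
labels, centers = cluster_strategies(Z, k=2)
reps = representatives(Z, labels, centers)
print("representative demonstration per cluster:", reps)
```

In a pipeline like the one summarized here, each representative would be the demonstration whose rollout is then summarized into landmark states; that later step is not sketched because its definition is not given in this summary.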

Abstract

As robots are deployed in human spaces, it is important that they are able to coordinate their actions with the people around them. Part of such coordination involves ensuring that people have a good understanding of how a robot will act in the environment. This can be achieved through explanations of the robot's policy. Much prior work in explainable AI and RL focuses on generating explanations for single-agent policies, but little work has explored generating explanations for collaborative policies. In this work, we investigate how to generate multi-agent strategy explanations for human-robot collaboration. We formulate the problem using a generic multi-agent planner, show how to generate visual explanations through strategy-conditioned landmark states, and generate textual explanations by giving the landmarks to an LLM. Through a user study, we find that when presented with explanations from our proposed framework, users are better able to explore the full space of strategies and collaborate more efficiently with new robot partners.

Paper Structure (22 sections, 4 equations, 11 figures)

Introduction
Related Work
Multi-Agent Multi-Strategy Games
Problem Formulation
Objective
Multi-Agent Strategy Alignment
Method: Strategy-Conditioned Landmarks
Strategy Clustering
Strategy Landmark State Generation
Textual Explanation Generation
Baseline: Video Explanation
Modification: Landmark Video Explanation
Collaborative Maze Task
Co-GAIL
Robustly Acting with Human Collaborators
...and 7 more sections

Figures (11)

Figure 1: Depiction of how strategy explanations can solve the strategy alignment problem. The task is for each agent to grab one diamond without collisions. Left: The human and robot start with mismatched strategies and ultimately collide. Right: The human and robot start with mismatched strategies, but the robot explains strategy $z_1$, aligning with the human and resulting in a successful rollout.
Figure 2: Pipeline for generating collaborative strategy explanations for a human. First, we identify strategies by clustering over the space of demonstrated examples. Then, we generate the set of landmark states to summarize each strategy cluster and generate textual descriptions with an LLM. Finally, the robot shows the human user the generated explanation before collaboration.
Figure 3: An explanation of one strategy generated on the collaborative maze task. Each image is a strategy landmark state computed by our method (Sec. \ref{sec:strategy_explanations}), and the text was generated by an LLM.
Figure 4: Left: Clustered latent strategy space for maze navigation. Right: The strategy at each cluster center corresponds to a different ordering of the human and robot picking up each jewel vs. standing on each button.
Figure 5: The three-part user study for testing our approach. First, users are assigned an explanation type: landmarks, landmark videos, videos, or no explanation. For each of two environments, users play $N$ games with one explanation between each (or none in the no-explanation case) and finally play one game, with no prior communication, with an agent that takes a random strategy.
...and 6 more figures

Theorems & Definitions (1)

Definition 1: Strategy-Conditioned Policy Landmark State
