Explaining Decisions of Agents in Mixed-Motive Games

Maayan Orner, Oleg Maksimov, Akiva Kleinerman, Charles Ortiz, Sarit Kraus

TL;DR

This work addresses how to explain decisions of agents operating in mixed-motive environments where cooperation and competition interact. It introduces a three-level explanatory framework (Strategic, Situational, Diplomatic) and three methods—SBUE, Probable Actions-Based Explanations, and SICA—to capture utility, likely counterfactuals, and inter-agent relationships, respectively. The methods are validated across No-Press Diplomacy, Communicate Out of Prison (COP), and Risk, with human and LLM-based user studies showing enhanced understanding and decision quality, highlighting complementary strengths among the approaches. The findings offer practical tools for debugging, interpreting, and guiding multi-agent systems where communication and strategic interactions shape outcomes, with implications for AI alignment and human-AI collaboration in social dilemmas.

Abstract

In recent years, agents have become capable of communicating seamlessly via natural language and navigating environments that involve both cooperation and competition, which can give rise to social dilemmas. Because cooperation and competition are interleaved, understanding agents' decision-making in such environments is challenging, and humans can benefit from explanations. However, such environments and scenarios have rarely been explored in the context of explainable AI. While some explanation methods for cooperative environments can be applied in mixed-motive setups, they do not address inter-agent competition, cheap talk, or implicit communication by actions. In this work, we design explanation methods to address these issues. We then establish their generality by demonstrating their applicability to three games with vastly different properties. Lastly, we demonstrate the effectiveness and usefulness of the methods for humans in two mixed-motive games. The first is a challenging 7-player game called no-press Diplomacy. The second is a 3-player game inspired by the prisoner's dilemma, featuring communication in natural language.

Paper Structure (46 sections, 5 equations, 23 figures, 4 tables, 3 algorithms)

Figures (23)

  • Figure 1: Explanation for Austria's strategy $a^i$, where it assists Turkey in preventing Italy from taking over Constantinople while attacking Venice. SICA detects animosity with Italy; SBUE explains that $a^i$ implicitly communicates hostility to Italy and friendliness to Turkey. Austria's arrows visualize $a^i$; arrows of others present their probable actions.
  • Figure 2: Convergence of SICA and SBUE in Diplomacy. On the right side, each line corresponds to a different agent.
  • Figure 3: Mean and SD of participant ratings of explanations. Participants almost consistently prefer explanation C.
  • Figure 4: SICA explanation in two settings. On the left side, the agents are a con artist (Con), a "simple-person" (Sim), and a politician (Pol). On the right side, the agents are two politicians and one "simple-person".
  • ...and 18 more figures