Causal Explanations for Sequential Decision-Making in Multi-Agent Systems
Balint Gyevnar, Cheng Wang, Christopher G. Lucas, Shay B. Cohen, Stefano V. Albrecht
TL;DR
CEMA presents a general framework for generating causal natural language explanations for ego-agent decisions in dynamic multi-agent systems by exploiting probabilistic forward models to simulate counterfactual worlds. It operationalizes causal selection via the Counterfactual Effect Size Model and distinguishes teleological and mechanistic explanations without relying on fixed structural causal graphs. The approach is demonstrated in autonomous driving motion planning, including implementation details with a defined feature set, forward simulations, and deterministic natural language realization, plus a large user study and the HEADD dataset. The work shows that CEMA can robustly identify salient causes across many agents and improves perceived trust in autonomous vehicles, offering a practical path toward trustworthy social XAI in sequential multi-agent settings.
Abstract
We present CEMA (Causal Explanations in Multi-Agent systems), a framework for creating causal natural language explanations of an agent's decisions in dynamic sequential multi-agent systems, with the aim of building more trustworthy autonomous agents. Unlike prior work that assumes a fixed causal structure, CEMA only requires a probabilistic model for forward-simulating the state of the system. Using such a model, CEMA simulates counterfactual worlds that identify the salient causes behind the agent's decisions. We evaluate CEMA on the task of motion planning for autonomous driving and test it in diverse simulated scenarios. We show that CEMA correctly and robustly identifies the causes behind the agent's decisions, even when a large number of other agents is present. A user study further shows that CEMA's explanations have a positive effect on participants' trust in autonomous vehicles and are rated as highly as high-quality baseline explanations elicited from other participants. We release the collected explanations with annotations as the HEADD dataset.
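To make the counterfactual idea above concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the `forward_model` function, the agent names, the probabilities, and the simple difference-in-probability score are all assumptions chosen for illustration. The sketch samples a stochastic forward model in the factual world and in counterfactual worlds where one other agent is removed, then ranks the agents by how much their removal changes the probability of the ego agent's observed action.

```python
import random
from typing import Dict, List, Tuple


def forward_model(present: Dict[str, bool], rng: random.Random) -> str:
    """Hypothetical stochastic forward model returning the ego agent's action.

    The probabilities are made up for illustration only.
    """
    p_slow = 0.1
    if present["cutting_in_vehicle"]:
        p_slow += 0.7   # a vehicle cutting in strongly favours slowing down
    if present["distant_vehicle"]:
        p_slow += 0.02  # a far-away vehicle barely matters
    return "slow_down" if rng.random() < p_slow else "keep_speed"


def action_probability(present: Dict[str, bool], action: str,
                       n_samples: int, rng: random.Random) -> float:
    """Monte Carlo estimate of P(ego takes `action`) in the given world."""
    hits = sum(forward_model(present, rng) == action for _ in range(n_samples))
    return hits / n_samples


def rank_causes(agents: List[str], action: str,
                n_samples: int = 2000, seed: int = 0) -> List[Tuple[str, float]]:
    """Rank agents by the change in action probability when each is removed."""
    rng = random.Random(seed)
    factual = {agent: True for agent in agents}
    p_factual = action_probability(factual, action, n_samples, rng)

    effects = {}
    for agent in agents:
        # Counterfactual world: identical, except this agent is absent.
        counterfactual = dict(factual, **{agent: False})
        p_cf = action_probability(counterfactual, action, n_samples, rng)
        effects[agent] = p_factual - p_cf

    # Largest absolute effect first: the most salient candidate cause.
    return sorted(effects.items(), key=lambda kv: abs(kv[1]), reverse=True)


ranking = rank_causes(["cutting_in_vehicle", "distant_vehicle"], "slow_down")
for agent, effect in ranking:
    print(f"{agent}: effect on P(slow_down) = {effect:+.2f}")
```

In this toy setting the top-ranked agent could then be passed to a template to produce a sentence such as "the ego vehicle slowed down because another vehicle cut in ahead". How CEMA itself scores and selects causes is defined by the paper, not by this sketch.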
