Contrastive explanations of BDI agents
Michael Winikoff
TL;DR
This work extends explanation mechanisms for Belief-Desire-Intention (BDI) agents to contrastive explanations, so that an agent can explain why it performed an action instead of an alternative (the foil). It combines a computational evaluation, which shows substantial reductions in explanation length, with a human-subject study, which found mixed preferences and some evidence of improved trust and perceived understanding for contrastive explanations. The approach formalizes contrastive factors through modified ancestry predicates and foil-aware filtering, and demonstrates scalability across randomized goal-plan trees and standard IPC scenarios. Practical implications emphasize foil alignment, cautious deployment, and the value of iterative user-centered evaluation for robust explainability. The findings suggest that contrastive explanations can enhance trust and comprehension in autonomous agents, although the effect depends on the scenario and on user expectations, which warrants careful design and further study.
Abstract
The ability of autonomous systems to provide explanations is important for supporting transparency and aiding the development of (appropriate) trust. Prior work has defined a mechanism that allows Belief-Desire-Intention (BDI) agents to answer questions of the form ``why did you do action $X$?''. However, people are known to ask \emph{contrastive} questions (``why did you do $X$ \emph{instead of} $F$?''). We therefore extend previous work to answer such questions. A computational evaluation shows that using contrastive questions yields a significant reduction in explanation length. A human subject evaluation was conducted to assess whether such contrastive answers are preferred, and how well they support trust development and transparency. We found some evidence that contrastive answers are preferred, and some evidence that they led to higher trust, perceived understanding, and confidence in the system's correctness. We also evaluated the benefit of providing explanations at all. Surprisingly, there was no clear benefit, and in some situations we found evidence that providing a (full) explanation was worse than not providing any explanation.
