Approximating Human Models During Argumentation-based Dialogues
Yinxu Tang, Stylianos Loukas Vasileiou, William Yeoh
TL;DR
The paper addresses explainable AI planning when the human user's mental model is uncertain and changes over time. It introduces a probabilistic human model $P_h$ over a shared propositional language and updates this distribution through argumentation-based dialogues, using trust-based and certainty-based update mechanisms. A prospect-theory-inspired weighting function maps expressed trust to argumentative probability, and a Bayesian update refines the agent's beliefs about the human model. The approach is evaluated in a human-subject study in which participants interact with an AI assistant to judge venue suitability; the results indicate that the method can capture the dynamics of human belief formation and improve perceived trust. The work contributes a principled framework for probabilistic model reconciliation in XAIP, along with empirical evidence that adaptive, human-centric explanations can enhance user trust and satisfaction in realistic decision scenarios.
Abstract
Explainable AI Planning (XAIP) aims to develop AI agents that can effectively explain their decisions and actions to human users, fostering trust and facilitating human-AI collaboration. A key challenge in XAIP is model reconciliation, which seeks to align the mental models of AI agents and humans. While existing approaches often assume a known and deterministic human model, this simplification may not capture the complexities and uncertainties of real-world interactions. In this paper, we propose a novel framework that enables AI agents to learn and update a probabilistic human model through argumentation-based dialogues. Our approach incorporates trust-based and certainty-based update mechanisms, allowing the agent to refine its understanding of the human's mental state based on the human's expressed trust in the agent's arguments and certainty in their own arguments. We employ a probability weighting function inspired by prospect theory to capture the relationship between trust and perceived probability, and use a Bayesian approach to update the agent's probability distribution over possible human models. We conduct a human-subject study to empirically evaluate the effectiveness of our approach in an argumentation scenario, demonstrating its ability to capture the dynamics of human belief formation and adaptation.
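To make the update idea concrete, the following is a minimal sketch rather than the paper's actual method. It assumes that possible human models are truth assignments over two hypothetical atoms, that the weighting function is the Tversky-Kahneman (1992) form with an illustrative parameter `gamma = 0.61`, and that the likelihood of an expressed trust level is the weighted trust for models satisfying the argued proposition and its complement otherwise. The abstract does not specify these choices, and all names (`weight`, `bayes_update`, `trust`) are illustrative.

```python
from itertools import product


def weight(p, gamma=0.61):
    """Tversky-Kahneman probability weighting function (an assumed choice)."""
    num = p ** gamma
    return num / (num + (1 - p) ** gamma) ** (1 / gamma)


def bayes_update(prior, likelihood):
    """Return the posterior over models given a likelihood function.

    prior: dict mapping model -> probability
    likelihood: callable mapping model -> P(evidence | model)
    """
    unnormalized = {m: p * likelihood(m) for m, p in prior.items()}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence has zero probability under the prior")
    return {m: v / total for m, v in unnormalized.items()}


# Hypothetical language with two atoms (a, b); each model is a truth assignment.
models = list(product([False, True], repeat=2))
prior = {m: 1 / len(models) for m in models}  # uniform prior over human models

# The human reports trust 0.8 in the agent's argument that atom `a` holds.
trust = 0.8
q = weight(trust)  # perceived probability after weighting

# Assumed likelihood: q if the model satisfies `a`, otherwise 1 - q.
posterior = bayes_update(prior, lambda m: q if m[0] else 1 - q)

# Marginal probability that the human believes `a` after the update.
p_a = sum(p for m, p in posterior.items() if m[0])
print(f"weighted trust = {q:.3f}, posterior P(a) = {p_a:.3f}")
```

With a uniform prior, the posterior marginal of `a` equals the weighted trust, so the weighting function alone determines how strongly a single trust report shifts the agent's estimate. Under this particular form, high trust values are underweighted and low ones are overweighted.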
