Mutual Theory of Mind for Human-AI Communication
Qiaosi Wang, Ashok K. Goel
TL;DR
As AI systems acquire social inference abilities akin to Theory of Mind (ToM), this paper introduces the Mutual Theory of Mind (MToM) framework to model how human and AI interpretations iteratively and mutually shape one another during communication. The framework identifies three core elements (interpretation, feedback, and mutuality) and three stages (construction, recognition, and revision) through which ToM-like reasoning is developed and refined. Two empirical studies in online learning show how linguistic cues can help an AI construct a model of users' perceptions and how users react to AI misrepresentations, informing the design of adaptive perception modeling and repair strategies. The work offers a path toward more trustworthy, responsive, and socially adept human-AI interaction by guiding feedback design, perception management, and misrepresentation repair across diverse contexts.
Abstract
New developments are enabling AI systems to perceive, recognize, and respond with social cues based on inferences drawn from humans' explicit or implicit behavioral and verbal cues. These AI systems, equipped with an equivalent of the human Theory of Mind (ToM) capability, currently serve as matchmakers on dating platforms, assist student learning as teaching assistants, and enhance productivity as work partners. They mark a new era of human-AI interaction (HAI) that diverges from traditional human-computer interaction (HCI), in which computers are commonly seen as tools rather than social actors. Understanding and designing for human perceptions and experiences in this emerging HAI era has become an urgent and critical issue if AI systems are to fulfill human needs and mitigate risks across social contexts. In this paper, we propose the Mutual Theory of Mind (MToM) framework, inspired by the human capacity for ToM in human-human communication, to guide this new generation of HAI research by highlighting the iterative and mutual shaping nature of human-AI communication. We discuss the motivation for the MToM framework and its three key components, which iteratively shape human-AI communication across three stages. We then describe two empirical studies inspired by the MToM framework to illustrate how MToM can guide the design and understanding of human-AI communication. Finally, we discuss future research opportunities in human-AI interaction through the lens of MToM.
