Expedient Assistance and Consequential Misunderstanding: Envisioning an Operationalized Mutual Theory of Mind
Justin D. Weisz, Michael Muller, Arielle Goldberg, Dario Andres Silva Moran
TL;DR
This paper uses design fiction to examine how a mutual theory of mind (MToM) between humans and AI could be operationalized in workplace contexts. The authors present three design fictions ("ALways Happy to Help", "Referral Roulette", and "Aim High, Stuart") to explore the utopian benefits and dystopian breakdowns of MToM in action. The work identifies three core aspects of MToM (the human's model of the AI's knowledge, the AI's model of the human, and bidirectional updating of both models) and discusses practical implications for trust, explainability, and governance. The fictions highlight both the productivity gains from proactive, context-aware AI and the risks from overreliance, misaligned models, and multi-bot handoffs, and they urge careful design of signifiers, transparency, and domain-appropriate use.
Abstract
Design fictions allow us to prototype the future. They enable us to interrogate emerging or non-existent technologies and examine their implications. We present three design fictions that probe the potential consequences of operationalizing a mutual theory of mind (MToM) between human users and one or more AI agents. We use these fictions to explore many aspects of MToM, including how models of the other party are shaped through interaction, how discrepancies between these models lead to breakdowns, and how models of a human's knowledge and skills enable AI agents to act in their stead. We examine these aspects through two lenses: a utopian lens in which MToM enhances human-human interactions and leads to synergistic human-AI collaborations, and a dystopian lens in which a faulty or misaligned MToM leads to problematic outcomes. Our work provides an aspirational vision for human-centered MToM research while simultaneously warning of the consequences when MToM is implemented incorrectly.
