Towards properly implementing Theory of Mind in AI systems: An account of four misconceptions

Ramira van der Meulen, Rineke Verbrugge, Max van Duijn

TL;DR

This paper interrogates four misconceptions about Theory of Mind (ToM) in AI, arguing that human ToM is not a single, modular function but a distributed set of processes that emerge from social interaction. It surveys robotics, developmental psychology, linguistics, neuroscience, and AI models to show that modular ToM simplifications misrepresent how humans reason about others, and that AI should leverage scripts, heuristics, and hybrid intelligence instead of fake or oversimplified ToM. The authors critically examine large language models, noting they can pass some ToM benchmarks due to data exposure rather than genuine social understanding, and advocate evaluating ToM in real-life, interactive settings. The practical takeaway is to design AI systems that explain their reasoning, ground with humans on shared knowledge, and combine AI’s memory and calculation strengths with human social intuition in a complementary, hybrid framework.

Abstract

The search for effective collaboration between humans and computer systems is one of the biggest challenges in Artificial Intelligence. One of the more effective mechanisms that humans use to coordinate with one another is theory of mind (ToM). ToM can be described as the ability to "take someone else's perspective and make estimations of their beliefs, desires and intentions, in order to make sense of their behaviour and attitudes towards the world". If leveraged properly, this skill can be very useful in human-AI collaboration. This raises the question of how to implement ToM when building an AI system. Humans and AI systems work quite differently, and ToM is a multifaceted concept, with each facet rooted in different research traditions across the cognitive and developmental sciences. We observe that researchers from artificial intelligence and the computing sciences, ourselves included, often have difficulty finding their way in the ToM literature. In this paper, we identify four common misconceptions around ToM that we believe should be taken into account when developing an AI system. We have hyperbolised these misconceptions for the sake of the argument, but add nuance in their discussion. The misconceptions we discuss are: (1) "Humans Use a ToM Module, So AI Systems Should As Well"; (2) "Every Social Interaction Requires (Advanced) ToM"; (3) "All ToM is the Same"; and (4) "Current Systems Already Have ToM". After discussing each misconception, we end the section with tentative guidelines on how it can be overcome.

Paper Structure

This paper contains 23 sections.