Beyond World Models: Rethinking Understanding in AI Models
Tarun Gupta, Danish Pruthi
TL;DR
The paper questions whether internal world-model representations amount to human-like understanding in AI. It uses three case studies from the philosophy of science (domino-based computation, mathematical proofs, and Bohr's atomic theory) to argue that world models miss crucial aspects of understanding, such as abstract reasoning, justification, and problem-situation explanations. World models capture certain predictive dynamics, but they do not suffice to explain why particular steps or mechanisms are meaningful, or how overarching insights are generated. The paper calls for theoretical frameworks that go beyond state- and transition-focused representations to capture understanding and explainability more fully.
Abstract
World models have garnered substantial interest in the AI community. These are internal representations that simulate aspects of the external world, track entities and states, capture causal relationships, and enable prediction of consequences. This contrasts with representations based solely on statistical correlations. A key motivation behind this research direction is that humans possess such mental world models, and finding evidence of similar representations in AI models might indicate that these models "understand" the world in a human-like way. In this paper, we use case studies from the philosophy of science literature to critically examine whether the world model framework adequately characterizes human-level understanding. We focus on specific philosophical analyses where the distinction between world model capabilities and human understanding is most pronounced. While these represent particular views of understanding rather than universal definitions, they help us explore the limits of world models.
