Undermining Mental Proof: How AI Can Make Cooperation Harder by Making Thinking Easier
Zachary Wojtowicz, Simon DeDeo
TL;DR
The paper argues that cheap, AI-enabled thinking can undermine social cooperation in low-trust settings by eroding mental proofs: observable actions that certify unobservable mental states such as knowledge and intentions. It formalizes mental proof through two mechanisms, signaling theory and proofs of knowledge, and uses worked examples such as sincere apologies and social proof to illustrate how AI disrupts trust-building. It shows that AI can flatten cost differentials or enable deceptive replicas, which weakens coordination and collective action, especially for people who cannot rely on strong institutions. The authors propose policy and design responses, including distinguishing AI-assisted from human-authored content and developing trust-enhancing protocols, to preserve the social value of mental proofs while retaining AI's benefits.
Abstract
Large language models and other highly capable AI systems ease the burdens of deciding what to say or do, but this very ease can undermine the effectiveness of our actions in social contexts. We explain this apparent tension by introducing the integrative theoretical concept of "mental proof," which occurs when observable actions are used to certify unobservable mental facts. From hiring to dating, mental proofs enable people to credibly communicate values, intentions, states of knowledge, and other private features of their minds to one another in low-trust environments where honesty cannot be easily enforced. Drawing on results from economics, theoretical biology, and computer science, we describe the core theoretical mechanisms that enable people to effect mental proofs. An analysis of these mechanisms clarifies when and how artificial intelligence can make low-trust cooperation harder despite making thinking easier.
