The evolution of trust as a cognitive shortcut in repeated interactions
Cedric Perret, The Anh Han, Elias Fernández Domingos, Theodor Cimpeanu, Simon T. Powers
TL;DR
Trust and cooperation are intertwined in social interactions, but existing models often conflate trust with cooperative outcomes. The authors formalize trust as a cognitive shortcut that reduces costly verification in repeated games, and use evolutionary game theory to analyze trust-based strategies across the full space of symmetric two-player social dilemmas. They show that trust-based strategies can outcompete Tit-for-Tat when verification is costly and errors occur, and that trust generally increases population-level cooperation, especially under high temptation to defect. The work provides a formal, observable measure of trust applicable to humans and AI, with implications for AI alignment and auditing regimes.
Abstract
Trust is often thought to increase cooperation. However, game-theoretic models often fail to distinguish between cooperative behaviour and trust. This makes it difficult to measure trust and determine its effect in different social dilemmas. We address this here by formalising trust as a cognitive shortcut in repeated games. This shortcut works by ceasing to check a partner's actions once a threshold level of cooperativeness has been observed. We consider trust-based strategies that implement this heuristic, and systematically analyse their evolution across the space of two-player symmetric social dilemma games. We find that where checking whether another agent's actions were cooperative is costly, as is the case in many real-world settings, trust-based strategies can outcompete standard reciprocal strategies such as Tit-for-Tat in many social dilemmas. Moreover, the presence of trust increases the overall level of cooperation in the population, especially in cases where agents can make unintentional errors in their actions. This occurs even in the presence of strategies designed to build and then exploit trust. Overall, our results demonstrate the individual adaptive benefit to an agent of using a trust heuristic, and provide a formal theory for how trust can promote cooperation in different types of social interaction. We discuss the implications of this for interactions between humans and artificial intelligence agents.
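The threshold heuristic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model. The class `TrustAgent`, the helper `play`, and the parameters `threshold` and `check_cost` are hypothetical names introduced here. The sketch also assumes a simplification: the agent behaves like Tit-for-Tat while it is still checking, and once it trusts it never checks again and cooperates unconditionally.

```python
from dataclasses import dataclass


@dataclass
class TrustAgent:
    # Hypothetical parameters, chosen for illustration only.
    threshold: int            # consecutive cooperative rounds needed before trusting
    check_cost: float         # cost paid each time the partner's action is checked
    coop_streak: int = 0      # current run of observed cooperation
    trusting: bool = False    # True once the threshold has been reached
    total_check_cost: float = 0.0
    last_seen: str = "C"      # start by cooperating, as Tit-for-Tat does

    def act(self) -> str:
        # While trusting, cooperate without looking; otherwise mirror the
        # partner's last observed action (Tit-for-Tat).
        return "C" if self.trusting else self.last_seen

    def observe(self, partner_action: str) -> None:
        if self.trusting:
            return  # the shortcut: no check, so no cost and no update
        self.total_check_cost += self.check_cost
        self.last_seen = partner_action
        self.coop_streak = self.coop_streak + 1 if partner_action == "C" else 0
        if self.coop_streak >= self.threshold:
            self.trusting = True


def play(agent: TrustAgent, partner_actions: list[str]) -> list[str]:
    """Return the agent's actions against a fixed sequence of partner actions."""
    actions = []
    for partner_action in partner_actions:
        actions.append(agent.act())
        agent.observe(partner_action)
    return actions


# A partner that cooperates three times and then defects.
partner = list("CCCDD")

truster = TrustAgent(threshold=3, check_cost=0.5)
print(play(truster, partner), truster.total_check_cost)

# A threshold that is never reached reduces the agent to plain Tit-for-Tat.
tft = TrustAgent(threshold=10**9, check_cost=0.5)
print(play(tft, partner), tft.total_check_cost)
```

Under these assumptions the trusting agent stops paying the checking cost after three rounds but keeps cooperating with a partner who has started to defect, whereas the Tit-for-Tat agent pays for a check every round and retaliates. This is the trade-off between verification cost and exploitability that the abstract refers to.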
