Overcoming the Machine Penalty with Imperfectly Fair AI Agents
Zhen Wang, Ruiqi Song, Chen Shen, Shiya Yin, Zhao Song, Balaraju Battu, Lei Shi, Danyang Jia, Talal Rahwan, Shuyue Hu
TL;DR
This paper tackles the longstanding machine penalty by testing whether AI agents powered by large language models can foster human cooperation in social dilemmas. By assigning AI agents three personas (cooperative, fair, and selfish) and conducting a large preregistered study with 1,152 participants in a ten-round prisoner's dilemma with pre-game communication, the authors show that only the fair persona overcomes the machine penalty, achieving cooperation rates comparable to those of human–human interactions. The results reveal that fair agents promote cooperative norms, even though they occasionally break promises, and that humans perceive them as possessing agency and intelligence similar to or greater than those of humans on certain dimensions. The study emphasizes that success hinges on endowing AI with human-like social-cognitive intelligence rather than superficial anthropomorphism, with broad implications for designing AI that can partner effectively with humans in complex social contexts.
Abstract
Despite rapid technological progress, effective human-machine cooperation remains a significant challenge. Humans tend to cooperate less with machines than with fellow humans, a phenomenon known as the machine penalty. Here, we show that artificial intelligence (AI) agents powered by large language models can overcome this penalty in social dilemma games with communication. In a pre-registered experiment with 1,152 participants, we deploy AI agents exhibiting three distinct personas: selfish, cooperative, and fair. Of these, only fair agents elicit human cooperation at rates comparable to human-human interactions. Analysis reveals that fair agents, similar to human participants, occasionally break pre-game cooperation promises, but nonetheless effectively establish cooperation as a social norm. These results challenge the conventional wisdom of machines as altruistic assistants or rational actors. Instead, our study highlights the importance of AI agents reflecting the nuanced complexity of human social behaviors: imperfect yet driven by deeper social cognitive processes.
