LLMs and people both learn to form conventions -- just not with each other
Cameron R. Jones, Agnese Lombardi, Kyle Mahowald, Benjamin K. Bergen
TL;DR
The paper investigates whether large language models (LLMs) can develop conversational conventions comparable to those of humans in a tangram referential game. It compares Human-Human, Human-AI, and AI-AI dyads across 50 rounds to assess convention formation. All dyads show gains in accuracy and lexical consistency, but only same-type pairs achieve humanlike convergence, with Human-AI pairs lagging behind and AI-AI pairs converging through different patterns. A second experiment tests a Humanlike prompting strategy intended to elicit more humanlike behavior from the AI, finding partial improvements in length and accuracy but persistent gaps in overlap and performance relative to Human-Human pairs. The results imply that conversational alignment requires more than mimicry or in-context learning: it also depends on shared interpretative biases and expectations about meaning. This underscores fundamental differences in human-AI coordination and informs how interactive AI systems should be designed and evaluated.
Abstract
Humans align to one another in conversation, adopting shared conventions that ease communication. We test whether LLMs form the same kinds of conventions in a multimodal communication game. Both humans and LLMs display evidence of convention-formation (increasing the accuracy and consistency of their turns while decreasing their length) when communicating in same-type dyads (humans with humans, AI with AI). However, heterogeneous human-AI pairs fail, suggesting differences in communicative tendencies. In Experiment 2, we ask whether LLMs can be induced to behave more like human conversants, by prompting them to produce superficially humanlike behavior. While the length of their messages matches that of human pairs, accuracy and lexical overlap in human-LLM pairs continue to lag behind those of both human-human and AI-AI pairs. These results suggest that conversational alignment requires not just the ability to mimic previous interactions, but also shared interpretative biases toward the meanings that are conveyed.
