Searching for Structure: Investigating Emergent Communication with Large Language Models
Tom Kouwenhoven, Max Peeperkorn, Tessa Verhoef
TL;DR
This study tests whether artificial languages evolve structure when optimised for large language models in a referential game. Using instruction-tuned Llama 3 70B with in-context prompts, two agents learn and gradually shape vocabularies, yielding increased structure and improved generalisation alongside non-humanlike degeneracy. The results align with human findings on learnability and structure but reveal biases unique to LLM learners that can amplify degeneracy under iterated learning. Overall, the work demonstrates the feasibility of using LLMs to simulate language evolution and highlights directions for human-machine collaborative experiments to further probe structure formation.
Abstract
Human languages have evolved to be structured through repeated language learning and use. These processes introduce biases that operate during language acquisition and shape linguistic systems toward communicative efficiency. In this paper, we investigate whether the same happens if artificial languages are optimised for the implicit biases of Large Language Models (LLMs). To this end, we simulate a classical referential game in which LLMs learn and use artificial languages. Our results show that initially unstructured holistic languages are indeed shaped to have some structural properties that allow two LLM agents to communicate successfully. Similar to observations in human experiments, generational transmission increases the learnability of languages, but can at the same time result in non-humanlike degenerate vocabularies. Taken together, this work extends experimental findings, shows that LLMs can be used as tools in simulations of language evolution, and opens possibilities for future human-machine experiments in this field.
