Language Games as the Pathway to Artificial Superhuman Intelligence
Ying Wen, Ziyu Wan, Shao Zhang
TL;DR
This paper tackles the data reproduction trap in large language models by proposing language games as an open-ended framework for expanded data reproduction. It formalizes a triad of design principles (role fluidity, reward variety, and rule plasticity) to sustain distributional shift and foster emergent capabilities toward artificial superhuman intelligence. The authors argue that scaling these games to global sociotechnical ecosystems yields a planetary data flywheel driven by human-AI co-evolution, with three pathways (cross-cultural fusion, distributed proof markets, and consensus reality engineering) and adaptive governance to guide safe growth. The work also discusses limitations, including expressive gaps, diffusion risks, oligopoly dynamics, and epistemological shifts, and emphasizes governance and multidisciplinary oversight to keep intelligence growth aligned with societal values.
Abstract
The evolution of large language models (LLMs) toward artificial superhuman intelligence (ASI) hinges on data reproduction, a cyclical process in which models generate, curate, and retrain on novel data to refine capabilities. Current methods, however, risk getting stuck in a data reproduction trap: optimizing outputs within fixed human-generated distributions in a closed loop leads to stagnation, as models merely recombine existing knowledge rather than explore new frontiers. In this paper, we propose language games as a pathway to expanded data reproduction, breaking this cycle through three mechanisms: (1) role fluidity, which enhances data diversity and coverage by enabling multi-agent systems to dynamically shift roles across tasks; (2) reward variety, which embeds multiple feedback criteria that can drive complex intelligent behaviors; and (3) rule plasticity, which iteratively evolves interaction constraints to foster learnability, thereby injecting continual novelty. Once language games are scaled into global sociotechnical ecosystems, human-AI co-evolution generates unbounded data streams that drive open-ended exploration. This framework redefines data reproduction not as a closed loop but as an engine for superhuman intelligence.
