Language Games as the Pathway to Artificial Superhuman Intelligence

Ying Wen, Ziyu Wan, Shao Zhang

TL;DR

This paper tackles the data reproduction trap in large language models by proposing language games as an open-ended framework for expanded data reproduction. It formalizes a triad of design principles—role fluidity, reward variety, and rule plasticity—to sustain distributional shift and foster emergent capabilities toward artificial superhuman intelligence. By scaling these games to global sociotechnical ecosystems, the authors argue for a planetary data flywheel driven by human-AI co-evolution, with three pathways (cross-cultural fusion, distributed proof markets, and consensus reality engineering) and adaptive governance to guide safe growth. The work also discusses limitations, including expressive gaps, diffusion risks, oligopoly dynamics, and epistemological shifts, emphasizing governance and multidisciplinary oversight to realign intelligence growth with societal values.

Abstract

The evolution of large language models (LLMs) toward artificial superhuman intelligence (ASI) hinges on data reproduction, a cyclical process in which models generate, curate and retrain on novel data to refine capabilities. Current methods, however, risk getting stuck in a data reproduction trap: optimizing outputs within fixed human-generated distributions in a closed loop leads to stagnation, as models merely recombine existing knowledge rather than explore new frontiers. In this paper, we propose language games as a pathway to expanded data reproduction, breaking this cycle through three mechanisms: (1) role fluidity, which enhances data diversity and coverage by enabling multi-agent systems to dynamically shift roles across tasks; (2) reward variety, embedding multiple feedback criteria that can drive complex intelligent behaviors; and (3) rule plasticity, iteratively evolving interaction constraints to foster learnability, thereby injecting continual novelty. By scaling language games into global sociotechnical ecosystems, human-AI co-evolution generates unbounded data streams that drive open-ended exploration. This framework redefines data reproduction not as a closed loop but as an engine for superhuman intelligence.

Paper Structure

This paper contains 27 sections, 3 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: An iterative data reproduction framework powered by language games progressively refines model capabilities toward artificial superhuman intelligence. From left to right: (1) Data reproduction via closed-loop optimization, (2) Expanded data reproduction via language games with evolving roles, rules and rich rewards, (3) Global language games driving continual adaptation and surpassing human-level capabilities.

Theorems & Definitions (4)

  • Definition 2.1: Data Reproduction
  • Definition 2.2: Data Reproduction Trap
  • Definition 2.3
  • Definition 3.1: Language Games