Large Language Models as Agents in Two-Player Games

Yang Liu, Peng Sun, Hang Li

TL;DR

This paper reframes LLM training and alignment as agent learning in two-player, language-based games, unifying pre-training, SFT, RLHF, prompting, and in-context learning under a game-theoretic, extensive-form framework. It maps each training stage to an agent-learning concept and analyzes environments as zero-sum, cooperative, or mixed games, which gives a principled lens on data design, reward shaping, and long-horizon reasoning for mitigating hallucination and improving robustness. The authors discuss data structuring (e.g., Q-A, Q-C-A), meta-learning across tasks, and world-model considerations, and they extend the framework to adversarial and cooperative settings, including superhuman aspirations and red-teaming. The work aims to guide future research on LLM alignment, safety, and capability enhancement by connecting insights from game theory (GT), reinforcement learning (RL), and multi-agent systems (MAS) with practical LLM training and prompting strategies, while highlighting open questions and potential societal impacts.

Abstract

By formally defining the training processes of large language models (LLMs), which usually encompass pre-training, supervised fine-tuning, and reinforcement learning with human feedback, within a single, unified machine learning paradigm, we can glean pivotal insights for advancing LLM technologies. This position paper delineates the parallels between the training methods of LLMs and the strategies employed for developing agents in two-player games, as studied in game theory, reinforcement learning, and multi-agent systems. We propose a re-conceptualization of LLM learning processes in terms of agent learning in language-based games. This framework unveils new perspectives on the successes and challenges in LLM development, offering a fresh understanding of how to address alignment issues, among other strategic considerations. Furthermore, our two-player game approach sheds light on novel data preparation and machine learning techniques for training LLMs.

Paper Structure

This paper contains 18 sections, 11 equations, 5 figures.

Figures (5)

  • Figure 1: LLMs can be viewed as agents participating in language-based games in the framework of reinforcement learning.
  • Figure 2: The game tree representation for the LLM formulation. Hollow circles: player one states; solid circles: player two states. On the trajectory (episode) ending at terminal state $z$, the representative states $s$, $s'$, $\tilde{s}$ and action $a$ are depicted, where $s \sqsubset \tilde{s}$, $(s,a) \sqsubset \tilde{s}$, etc. The active players are denoted as $P(s)=1$, $P(s')=2$, $P(\tilde{s})=2$. The edge $(s,a)$ leads to $s'$ according to the transition function $s'=T(s,a)$. The visiting probability $d^{\pi}(\tilde{s}|s,a)$, starting from the edge $(s,a)$ and reaching the node $\tilde{s}$, is given by equation \ref{eq:cond-visit-prob}. Similarly, $d^{\pi}(z)$ is given by equation \ref{eq:decompose-s}.
  • Figure 3: A fixed policy for player $-i$ induces an MDP for player $i$, whose environment dynamics are $\Pr(s'|s,a)=d^{\pi}(s'|s,a)$, obtained from the visiting probability in equation \ref{eq:cond-visit-prob} and from the fact that the underlying transition $T(\cdot,\cdot)$ is deterministic.
  • Figure 4: Example of a token sequence in pre-training data. It can be interpreted as a series of state-action pairs resulting from a game between two RL agents.
  • Figure 5: Example of a token sequence in SFT data. It can be interpreted as a series of state-action pairs resulting from a game between two RL agents.
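
The idea in the Figure 2 and Figure 3 captions can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a state is the tuple of tokens played so far, that the deterministic transition $T(s,a)$ simply appends $a$ to $s$, and that the visiting probability is the product of the fixed opponent's policy probabilities along the path (the equation itself is not reproduced in this summary). The function `induced_transition`, the toy turn rule, and the toy opponent policy are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, Tuple

State = Tuple[str, ...]  # assumption: a state is the token sequence so far


def induced_transition(
    s: State,
    a: str,
    active_player: Callable[[State], int],
    is_terminal: Callable[[State], bool],
    opponent_policy: Callable[[State], Dict[str, float]],
    i: int,
) -> Dict[State, float]:
    """Distribution over the next state where player i acts (or a terminal
    state), after player i takes action a in state s and the other player
    follows a fixed policy."""
    out: Dict[State, float] = {}
    # Deterministic transition T(s, a): append the action to the state.
    stack = [(s + (a,), 1.0)]
    while stack:
        state, prob = stack.pop()
        if is_terminal(state) or active_player(state) == i:
            # Player i is to move again (or the episode ended): record it.
            out[state] = out.get(state, 0.0) + prob
            continue
        # Otherwise the opponent moves; multiply in its policy probability.
        for action, p in opponent_policy(state).items():
            if p > 0.0:
                stack.append((state + (action,), prob * p))
    return out


# Toy game (hypothetical): player 1 moves when len(state) % 3 == 0,
# player 2 moves otherwise, and the episode ends after six tokens.
def toy_active_player(state: State) -> int:
    return 1 if len(state) % 3 == 0 else 2


def toy_is_terminal(state: State) -> bool:
    return len(state) >= 6


def toy_opponent_policy(state: State) -> Dict[str, float]:
    return {"x": 0.25, "y": 0.75}


probs = induced_transition(
    (), "a", toy_active_player, toy_is_terminal, toy_opponent_policy, i=1
)
for next_state, p in sorted(probs.items()):
    print(next_state, p)
```

From player 1's point of view, the opponent's two consecutive moves are folded into a single stochastic environment step, which is the sense in which a fixed opponent policy turns the deterministic game tree into an MDP.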