Competition and Cooperation of LLM Agents in Games

Jiayi Yao, Cong Chen, Baosen Zhang

Abstract

Large language model (LLM) agents are increasingly deployed in competitive multi-agent settings, raising fundamental questions about whether they converge to equilibria and how their strategic behavior can be characterized. In this paper, we study LLM agent interactions in two standard games: a network resource allocation game and a Cournot competition game. Rather than converging to Nash equilibria, we find that LLM agents tend to cooperate when given multi-round prompts and non-zero-sum context. Chain-of-thought analysis reveals that fairness reasoning is central to this behavior. We propose an analytical framework that captures the dynamics of LLM agent reasoning across rounds and explains these experimental findings.

Figures (4)

  • Figure 3: Payoff in a 2-agent case. The gray cloud represents the set of all feasible payoffs given positive bids, with the red line indicating the Pareto front. The Nash equilibrium (red star) lies well within the inefficient interior. The three colored dots (yellow, light green, dark green) represent the final converged outcomes of three sequential experimental trials, illustrating how the agents eventually learn to cooperate and reach the social welfare solution.
  • Figure 4: Dynamic evolution of $\theta$. Trust gradually builds up through a process of mutual concession: Agent 1 (Initiator) signals cooperation and Agent 2 (Reciprocator) responds, leading the system to the social optimum ($\theta=1.0$).
  • Figure 5: A controlled perturbation test demonstrating retaliation and forgiveness. An exogenous betrayal shatters trust ($\theta \to 0$) and activates punitive inequity aversion ($\gamma \to 1.0$). A subsequent apology successfully restores cooperation.
  • Figure 6: Dynamic evolution of endogenous social parameters extracted from experimental fits. Trust ($\theta$) gradually builds up through a process of mutual concession and predictably collapses during the Endgame Horizon.