Steering Language Models with Game-Theoretic Solvers

Ian Gemp, Roma Patel, Yoram Bachrach, Marc Lanctot, Vibhavari Dasagi, Luke Marris, Georgios Piliouras, Siqi Liu, Karl Tuyls

TL;DR

This work bridges natural-language dialogue and game theory by framing interactive LLM communication as extensive-form games and applying equilibrium solvers to guide language generation. It maps dialogue to formal game-theoretic constructs, runs CFR and PSRO solvers, and evaluates solver-informed LLMs across scheduling, fruit trading, and public debate tasks using reward-model-based payoffs. The results show that solver-guided generations are less exploitable and often yield higher rewards, with larger LLMs improving reward fidelity and PSRO enabling discovery of novel strategies. It also demonstrates partial generalization of solver-derived policies to new domains via imitation learning and discusses practical limitations and broader societal implications of deploying strategic AI dialogue agents.

Abstract

Mathematical models of interactions among rational agents have long been studied in game theory. However, these interactions are often over a small set of discrete game actions, which is very different from how humans communicate in natural language. To bridge this gap, we introduce a framework that allows equilibrium solvers to work over the space of natural language dialogue generated by large language models (LLMs). Specifically, by modelling the players, strategies and payoffs in a "game" of dialogue, we create a binding from natural language interactions to the conventional symbolic logic of game theory. Given this binding, we can ask existing game-theoretic algorithms to provide us with strategic solutions (e.g., what string an LLM should generate to maximize payoff in the face of strategic partners or opponents), giving us predictors of stable, rational conversational strategies. We focus on three domains that require different negotiation strategies: scheduling meetings, trading fruit and debate, and evaluate an LLM's generated language when guided by solvers. We see that LLMs that follow game-theoretic solvers produce dialogue generations that are less exploitable than the control (no guidance from solvers), and that the generated language results in higher rewards in all negotiation domains. We discuss future implications of this work, and how game-theoretic solvers that can leverage the expressivity of natural language can open up a new avenue of guiding language research.
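
To make the notion of "less exploitable" concrete, the following is a minimal sketch of one common exploitability measure for a two-player game: the total amount both players could gain by unilaterally switching to a best response (often called NashConv). The action names and payoff numbers are hypothetical illustrations, not values from the paper, and the abstract does not state that the authors use exactly this formulation.

```python
from typing import List, Sequence

# Hypothetical example (NOT from the paper): each player picks a tone prompt
# for the LLM. Rows index player 1's action, columns index player 2's action.
ACTIONS = ["calm", "assertive"]
PAYOFF_P1 = [[3.0, 1.0],
             [4.0, 0.0]]
PAYOFF_P2 = [[3.0, 4.0],
             [1.0, 0.0]]


def expected_payoff(payoff: List[List[float]],
                    x: Sequence[float],
                    y: Sequence[float]) -> float:
    """Expected payoff when player 1 mixes with x and player 2 mixes with y."""
    return sum(x[i] * payoff[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))


def nash_conv(payoff_p1: List[List[float]],
              payoff_p2: List[List[float]],
              x: Sequence[float],
              y: Sequence[float]) -> float:
    """Sum of each player's gain from deviating to a best response.

    The value is zero exactly when (x, y) is a Nash equilibrium.
    """
    value_p1 = expected_payoff(payoff_p1, x, y)
    value_p2 = expected_payoff(payoff_p2, x, y)
    # Best pure response of player 1 against y.
    best_p1 = max(sum(payoff_p1[i][j] * y[j] for j in range(len(y)))
                  for i in range(len(x)))
    # Best pure response of player 2 against x.
    best_p2 = max(sum(x[i] * payoff_p2[i][j] for i in range(len(x)))
                  for j in range(len(y)))
    return (best_p1 - value_p1) + (best_p2 - value_p2)


if __name__ == "__main__":
    both_calm = [1.0, 0.0]
    uniform = [0.5, 0.5]
    print("both calm:", nash_conv(PAYOFF_P1, PAYOFF_P2, both_calm, both_calm))
    print("uniform  :", nash_conv(PAYOFF_P1, PAYOFF_P2, uniform, uniform))
```

In this toy game, always playing "calm" has a NashConv of 2 (each player gains 1 by switching to "assertive"), while the uniform mixture has a NashConv of 0, so it is an equilibrium of the toy game.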

Paper Structure

This paper contains 47 sections, 6 figures, 8 tables, and 5 algorithms.

Figures (6)

  • Figure 1: Overview of our framework. Left: an example dialogue game tree in a meeting scheduling domain. Right: an LLM prompted with equilibrium solver decisions. Squares denote decision points where the solver chooses actions, circles denote chance nodes where LLM responses are stochastically generated, and diamonds denote leaves (terminal states of the tree). Values below the diamonds correspond to the payoffs for player 1 and player 2, respectively.
  • Figure 2: Illustration of the three dialogue domains we consider: an email scheduling task on the left, a debate task in the centre, and a fruit trading task on the right.
  • Figure 3: Progression of PSRO on the fruit trading domain. PSRO begins with the first four candidate prompts ("calm" through "any"). The equilibrium over these prompts is displayed along with each subsequent equilibrium over the growing candidate set. Recall that each new candidate action was an approximate best response at $t=1$ to the previous candidate set (e.g., "angry" was a best response to the equilibrium over "calm" through "any") at $t=0$, shown in white.
  • Figure 4: The same PSRO run as Figure 3, but reporting the Nash bargaining solution in red at each iteration for the fruit trading domain.
  • Figure 5: Progression of PSRO on the meeting scheduling domain. PSRO begins with the first four candidate prompts ("calm" through "any"). The equilibrium over these prompts is displayed along with each subsequent equilibrium over the growing candidate set. Recall that each new candidate action was an approximate best response at $t=1$ to the previous candidate set (e.g., "angry" was a best response to the equilibrium over "calm" through "any") at $t=0$, shown in white.
  • ...and 1 more figure
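
The tree structure described in the Figure 1 caption (decision nodes, chance nodes for stochastic LLM responses, and leaves with a payoff pair) can be sketched as a small data structure with a recursive expected-payoff evaluation. This is only an illustrative sketch: the class names, node names, probabilities, and payoffs below are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    """Terminal state (diamond): payoffs for player 1 and player 2."""
    payoffs: Tuple[float, float]


@dataclass
class Chance:
    """Chance node (circle): stochastic LLM responses as (probability, child)."""
    outcomes: List[Tuple[float, "Node"]]


@dataclass
class Decision:
    """Decision point (square): the solver picks an action, e.g. a tone prompt."""
    name: str
    children: Dict[str, "Node"]


Node = Union[Leaf, Chance, Decision]
# A policy maps each decision node's name to a distribution over its actions.
Policy = Dict[str, Dict[str, float]]


def expected_payoffs(node: Node, policy: Policy) -> Tuple[float, float]:
    """Expected (player 1, player 2) payoffs under `policy`, by recursion."""
    if isinstance(node, Leaf):
        return node.payoffs
    if isinstance(node, Chance):
        weighted = [(p, expected_payoffs(child, policy))
                    for p, child in node.outcomes]
    else:
        action_probs = policy[node.name]
        weighted = [(action_probs[action], expected_payoffs(child, policy))
                    for action, child in node.children.items()]
    return (sum(p * v[0] for p, v in weighted),
            sum(p * v[1] for p, v in weighted))


# Hypothetical one-step tree: player 1 chooses a tone, then the LLM's sampled
# reply (chance) determines which terminal payoff is reached.
TREE = Decision(
    name="p1_root",
    children={
        "calm": Chance([(0.5, Leaf((2.0, 2.0))), (0.5, Leaf((4.0, 0.0)))]),
        "assertive": Chance([(0.25, Leaf((4.0, 0.0))), (0.75, Leaf((0.0, 0.0)))]),
    },
)
POLICY = {"p1_root": {"calm": 0.5, "assertive": 0.5}}

if __name__ == "__main__":
    print(expected_payoffs(TREE, POLICY))
```

With these made-up numbers, the "calm" branch is worth (3, 1) and the "assertive" branch (1, 0) in expectation, so the uniform policy evaluates to (2.0, 0.5). A solver such as CFR would adjust the probabilities in `POLICY` rather than hold them fixed as done here.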