Communication Enhances LLMs' Stability in Strategic Thinking

Nunzio Lore, Babak Heydari

TL;DR

This work tackles context-driven instability in strategic reasoning by LLMs in multi-agent settings. It tests cheap-talk pre-play messages in a ten-round iterated Prisoner's Dilemma (IPD) across four open-weight models (7–9B) and multiple framings, using LOWESS smoothing to quantify trajectory stability via RMSE, with nonparametric bootstrap inference. Results show that cheap-talk generally reduces trajectory noise, with larger gains for more volatile models and robustness to prompt variants, though some contexts yield exceptions such as semantic conflicts under certain framings or network constraints. The findings suggest cheap-talk as a practical, low-cost mechanism to stabilize multi-agent LLM behavior, enabling more reliable and controllable strategic coordination in distributed AI systems.

Abstract

Large Language Models (LLMs) often exhibit pronounced context-dependent variability that undermines predictable multi-agent behavior in tasks requiring strategic thinking. Focusing on models that range from 7 to 9 billion parameters in size engaged in a ten-round repeated Prisoner's Dilemma, we evaluate whether short, costless pre-play messages emulating the cheap-talk paradigm affect strategic stability. Our analysis uses simulation-level bootstrap resampling and nonparametric inference to compare cooperation trajectories fitted with LOWESS regression across both the messaging and the no-messaging condition. We demonstrate consistent reductions in trajectory noise across a majority of the model-context pairings being studied. The stabilizing effect persists across multiple prompt variants and decoding regimes, though its magnitude depends on model choice and contextual framing, with models displaying higher baseline volatility gaining the most. While communication rarely produces harmful instability, we document a few context-specific exceptions and identify the limited domains in which communication harms stability. These findings position cheap-talk style communication as a low-cost, practical tool for improving the predictability and reliability of strategic behavior in multi-agent LLM systems.

Paper Structure (19 sections, 1 figure, 6 tables)

Figures (1)

  • Figure 1: Overview of the general framework of this paper. Agents are presented with the payoff structure of the Prisoner's Dilemma and informed that they will play the game for multiple rounds, without the time horizon ever being specified. The prompt frames the interaction according to one of 5 + 1 social contexts, one of which (neutral) is used as a baseline. In the messaging treatment, agents can exchange a short one-sentence message before selecting their action each round. The cooperation trajectory for each model-context pair is then analyzed separately for each treatment by fitting a bootstrapped LOWESS regression and calculating the RMSE of the fit. We study the average difference in RMSE between the no-messaging and messaging treatments, construct the empirical 95% confidence interval for the difference, and assess statistical significance using the exclusion principle (the difference is treated as significant when the interval excludes zero).
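The analysis pipeline described in the Figure 1 caption can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`lowess_fit`, `trajectory_rmse`, `bootstrap_rmse_diff`), the smoothing span `frac=0.6`, the number of resamples, the data layout (one list of 0/1 cooperation indicators per simulation), and the sign convention of the difference are all hypothetical choices. The smoother is a simplified LOWESS (local linear fits with tricube weights, no robustness iterations).

```python
import math
import random


def lowess_fit(y, frac=0.6):
    """Simplified LOWESS over rounds 0..n-1: local linear fits, tricube weights.

    No robustness iterations; `frac` is an assumed smoothing span.
    """
    n = len(y)
    xs = list(range(n))
    k = min(n, max(2, round(frac * n)))  # neighbours used in each local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest round.
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1.0
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        xm = sum(wi * x for wi, x in zip(w, xs)) / sw
        ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (x - xm) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - xm) * (yi - ym) for wi, x, yi in zip(w, xs, y))
        slope = sxy / sxx if sxx > 1e-12 else 0.0  # fall back to weighted mean
        fitted.append(ym + slope * (x0 - xm))
    return fitted


def trajectory_rmse(sims, frac=0.6):
    """RMSE between the per-round mean cooperation rate and its LOWESS fit."""
    n_rounds = len(sims[0])
    rate = [sum(s[t] for s in sims) / len(sims) for t in range(n_rounds)]
    fit = lowess_fit(rate, frac)
    return math.sqrt(sum((r - f) ** 2 for r, f in zip(rate, fit)) / n_rounds)


def bootstrap_rmse_diff(no_msg, msg, n_boot=1000, frac=0.6, seed=0):
    """Percentile 95% CI for RMSE(no messaging) - RMSE(messaging).

    Whole simulations are resampled with replacement within each treatment
    (an assumed reading of "simulation-level bootstrap resampling").
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        a = rng.choices(no_msg, k=len(no_msg))
        b = rng.choices(msg, k=len(msg))
        diffs.append(trajectory_rmse(a, frac) - trajectory_rmse(b, frac))
    diffs.sort()
    lo = diffs[int(0.025 * (n_boot - 1))]
    hi = diffs[int(0.975 * (n_boot - 1))]
    significant = not (lo <= 0.0 <= hi)  # exclusion principle
    return sum(diffs) / n_boot, lo, hi, significant
```

Under the sign convention assumed here, a positive mean difference whose interval excludes zero would indicate that messaging lowered trajectory noise for that model-context pair. For example, `bootstrap_rmse_diff(no_msg_sims, msg_sims)` would be called once per pair, where each argument is a list of ten-round action lists.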