Communication Enhances LLMs' Stability in Strategic Thinking
Nunzio Lore, Babak Heydari
TL;DR
This work tackles context-driven instability in the strategic reasoning of LLMs in multi-agent settings. It tests cheap-talk pre-play messages in a ten-round iterated Prisoner's Dilemma (IPD) across four open-weight models (7–9B parameters) and multiple framings, using LOWESS smoothing to quantify trajectory stability via RMSE, with nonparametric bootstrap inference. Results show that cheap talk generally reduces trajectory noise, with larger gains for more volatile models and robustness to prompt variants, though some contexts yield exceptions, such as semantic conflicts under certain framings or network constraints. The findings suggest cheap talk as a practical, low-cost mechanism for stabilizing multi-agent LLM behavior, enabling more reliable and controllable strategic coordination in distributed AI systems.
Abstract
Large Language Models (LLMs) often exhibit pronounced context-dependent variability that undermines predictable multi-agent behavior in tasks requiring strategic thinking. Focusing on models of 7 to 9 billion parameters engaged in a ten-round repeated Prisoner's Dilemma, we evaluate whether short, costless pre-play messages emulating the cheap-talk paradigm affect strategic stability. Our analysis uses simulation-level bootstrap resampling and nonparametric inference to compare cooperation trajectories, fitted with LOWESS regression, between the messaging and no-messaging conditions. We find consistent reductions in trajectory noise in a majority of the model-context pairings studied. The stabilizing effect persists across multiple prompt variants and decoding regimes, though its magnitude depends on model choice and contextual framing, with models displaying higher baseline volatility gaining the most. Communication rarely harms stability; we document the few context-specific exceptions and delineate the limited domains in which they arise. These findings position cheap-talk style communication as a low-cost, practical tool for improving the predictability and reliability of strategic behavior in multi-agent LLM systems.
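To make the analysis pipeline described above concrete, the following is a minimal sketch and not the authors' implementation. It assumes each simulation is a list of ten binary cooperation outcomes, that trajectory noise is the RMSE of the per-round cooperation rate around its smoothed fit, and that the bootstrap resamples whole simulations with replacement. The smoother is a simplified, non-robust local linear regression with tricube weights standing in for full LOWESS. All function names, the `frac` bandwidth, and the toy cooperation probabilities are hypothetical.

```python
import math
import random


def lowess_fit(x, y, frac=0.6):
    """Simplified LOWESS: local linear fit with tricube weights, no robustness passes."""
    n = len(x)
    k = max(2, math.ceil(frac * n))  # number of neighbours defining the bandwidth
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1.0  # distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 1e-12 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def trajectory_rmse(sims, frac=0.6):
    """RMSE of the per-round cooperation rate around its smoothed trajectory."""
    n_rounds = len(sims[0])
    rounds = list(range(1, n_rounds + 1))
    rate = [sum(s[t] for s in sims) / len(sims) for t in range(n_rounds)]
    fit = lowess_fit(rounds, rate, frac)
    return math.sqrt(sum((r - f) ** 2 for r, f in zip(rate, fit)) / n_rounds)


def bootstrap_rmse_diff(no_msg, msg, n_boot=500, seed=0):
    """Percentile 95% interval for RMSE(no messaging) - RMSE(messaging).

    Resamples whole simulations with replacement (assumed resampling unit).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        a = [rng.choice(no_msg) for _ in no_msg]
        b = [rng.choice(msg) for _ in msg]
        diffs.append(trajectory_rmse(a) - trajectory_rmse(b))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


def simulate(n_sims, probs, rng):
    """Toy data generator: one Bernoulli cooperation draw per round."""
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(n_sims)]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical per-round cooperation probabilities, for illustration only.
    jagged = [0.8, 0.3, 0.7, 0.2, 0.6, 0.3, 0.7, 0.2, 0.6, 0.1]
    smooth = [0.8, 0.78, 0.75, 0.72, 0.7, 0.68, 0.65, 0.6, 0.55, 0.5]
    no_msg = simulate(50, jagged, rng)
    msg = simulate(50, smooth, rng)
    print("RMSE without messaging:", trajectory_rmse(no_msg))
    print("RMSE with messaging:", trajectory_rmse(msg))
    print("95% bootstrap interval for the difference:", bootstrap_rmse_diff(no_msg, msg))
```

Under these assumptions, a bootstrap interval for the RMSE difference that lies entirely above zero would indicate lower trajectory noise in the messaging condition. The choice of bandwidth and of the resampling unit both affect the result, so they should be taken from the paper's methods section rather than from this sketch.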
