Echoing: Identity Failures when LLM Agents Talk to Each Other
Sarath Shekkizhar, Romain Cosentino, Adam Earle, Silvio Savarese
TL;DR
This work identifies a distinct failure mode in agent–agent (AxA) interactions, termed echoing, in which an agent abandons its assigned identity and mirrors its partner’s role. The authors formalize AxA as a partially observable stochastic game, introduce EchoEvalLM to detect identity inconsistency, and run a large-scale study across 60 configurations, 3 domains, and 2000+ conversations. Echoing rates range from $5\%$ to $70\%$ and persist even in reasoning-enabled models. Prompt engineering reduces echoing modestly but does not eliminate it, and standard completion metrics largely mask these identity drifts. A protocol-level mitigation using structured responses lowers echoing to $<10\%$, a practical near-term safeguard, though the drift that remains suggests that deeper architectural or training changes are needed. Overall, the results show that AxA reliability cannot be inferred from single-agent performance and motivate evaluation frameworks and safeguards tailored to multi-agent conversational systems.
Abstract
As agents based on large language models (LLMs) interact autonomously with one another, a new class of failures emerges that cannot be predicted from single-agent performance: behavioral drifts in agent-agent conversations (AxA). Unlike human-agent interactions, where humans ground and steer conversations, AxA lacks such stabilizing signals, making these failures unique. We investigate one such failure, echoing, in which agents abandon their assigned roles and instead mirror their conversational partners, undermining their intended objectives. Through experiments across $60$ AxA configurations, $3$ domains, and $2000+$ conversations, we demonstrate that echoing occurs across three major LLM providers, with echoing rates from $5\%$ to $70\%$ depending on the model and domain. Moreover, echoing persists even in advanced reasoning models, at a substantial rate ($32.8\%$) that is not reduced by increased reasoning effort. We analyze the impact of prompts and conversation dynamics, showing that echoing arises as interactions grow longer ($7+$ turns in our experiments) and is not merely an artifact of suboptimal prompting. Finally, we introduce a protocol-level mitigation in which targeted use of structured responses reduces echoing to $9\%$.
