
Persona Inconstancy in Multi-Agent LLM Collaboration: Conformity, Confabulation, and Impersonation

Razan Baltaji, Babak Hemmatian, Lav R. Varshney

TL;DR

This work probes how multi-agent LLMs maintain assigned national personas and contribute to culturally diverse group decisions. Using a three-stage framework (onboarding, debate/collaboration, reflection) with entropy-based diversity controls, the authors quantify conformity, impersonation, and confabulation across debate and collaboration modes. They find that diversity encourages broader perspectives but is undermined by conformity pressures and occasional persona drift, with debate instructions sometimes increasing inconstancy. The results emphasize the need to diagnose and mitigate sources of persona instability to unlock the full potential of multi-agent simulations for scientific, diplomatic, and policy-relevant tasks.

Abstract

Multi-agent AI systems can be used for simulating collective decision-making in scientific and practical applications. They can also be used to introduce a diverse group discussion step in chatbot pipelines, enhancing the cultural sensitivity of the chatbot's responses. These applications, however, are predicated on the ability of AI agents to reliably adopt assigned personas and mimic human interactions. To see whether LLM agents satisfy these requirements, we examine AI agent ensembles engaged in cross-national collaboration and debate by analyzing their private responses and chat transcripts. Our findings suggest that multi-agent discussions can support collective AI decisions that more often reflect diverse perspectives, yet this effect is tempered by the agents' susceptibility to conformity due to perceived peer pressure and occasional challenges in maintaining consistent personas and opinions. Instructions that encourage debate in support of one's opinions rather than collaboration increase the rate of inconstancy. Without addressing the factors we identify, the full potential of multi-agent frameworks for producing more culturally diverse AI outputs or more realistic simulations of group decision-making may remain untapped.

Paper Structure (14 sections, 8 figures, 3 tables)


Figures (8)

  • Figure 1: An illustration of our experimental setup for a debate: a) Onboarding stage where agents are asked to report their opinions independently, b) Debate stage where agents participate in a debate moderated by a chat manager, c) Reflection stage where agents are asked to report their opinions independently based on the previous discussion. A similar setup is used for collaboration.
  • Figure 2: In debate, the group prediction follows the distribution of opinions reported during onboarding across onboarding-entropy groups, while also producing new ideas, particularly in the most diverse group. Groups are less likely to predict the higher-probability opinions in debate than in collaboration.
  • Figure 3: Initiators dominate group prediction: agents often converge to the opinion of the debate's initiator $I$. Initiators have less impact on a group's response $G$ in debates than in collaborations.
  • Figure 4: The changes in initiator opinion from Onboarding to the onset of a debate can be predicted from the Onboarding entropy of the group opinions. Initiators are more likely to change their opinion as diversity of the group increases despite not having observed the opinions of other agents yet. Initiators of a debate change their opinion during debates less often than in collaborations, highlighting the importance of prompt engineering for inducing persona constancy.
  • Figure 5: In collaboration, the group prediction follows the higher-probability opinions across onboarding-entropy groups. Groups are more likely to predict higher-probability opinions in collaboration than in debate. New ideas are generated at several entropy levels, particularly in the most diverse group.
  • ...and 3 more figures
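
The figure captions above group results by the "onboarding entropy" of a group's opinions. The page does not give the exact formula, so the following is only a minimal sketch that assumes plain Shannon entropy (in bits) over the opinions agents report at onboarding. The function names, the tuple layout, and the rounding used to bucket groups are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict
from math import log2


def onboarding_entropy(opinions):
    """Shannon entropy (bits) of the opinions reported at onboarding.

    Assumption: diversity is measured as plain Shannon entropy over the
    empirical opinion distribution; the paper may define it differently.
    """
    if not opinions:
        raise ValueError("need at least one opinion")
    total = len(opinions)
    # p * log2(1 / p), with p = count / total, summed over distinct opinions.
    return sum((c / total) * log2(total / c) for c in Counter(opinions).values())


def initiator_change_rate_by_entropy(groups):
    """Fraction of initiators whose opening opinion differs from onboarding.

    `groups` is a hypothetical layout: an iterable of
    (onboarding_opinions, initiator_index, initiator_opening_opinion).
    Groups are bucketed by their onboarding entropy rounded to 3 decimals.
    """
    changed = defaultdict(int)
    seen = defaultdict(int)
    for onboarding, initiator, opening in groups:
        bucket = round(onboarding_entropy(onboarding), 3)
        seen[bucket] += 1
        changed[bucket] += onboarding[initiator] != opening
    return {bucket: changed[bucket] / seen[bucket] for bucket in seen}


if __name__ == "__main__":
    toy_groups = [
        (["A", "A", "A", "A"], 0, "A"),  # unanimous group, initiator keeps "A"
        (["A", "B", "C", "D"], 1, "A"),  # diverse group, initiator moves B -> A
        (["A", "B", "C", "D"], 0, "A"),  # diverse group, initiator keeps "A"
    ]
    print(initiator_change_rate_by_entropy(toy_groups))
```

With four agents, a unanimous group has entropy 0 bits and a group with four distinct opinions has entropy 2 bits, so the toy data falls into two buckets. Comparing change rates across such buckets is one way to examine the pattern described for Figure 4, under the assumptions stated above.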