From Biased Chatbots to Biased Agents: Examining Role Assignment Effects on LLM Agent Robustness

Linbo Cao, Lihao Sun, Yang Yue

TL;DR

This study shows that demographic-based persona conditioning can meaningfully distort LLM agent behavior, undermining task performance across diverse domains. By evaluating three widely used LLMs on five agentic benchmarks with 23 personas spanning gender, race/origin, religion, and profession, the authors demonstrate degradations of up to 26.2% and reveal that biases propagate from language to action. The findings highlight an overlooked vulnerability in LLM agents that can compromise safety and reliability in real-world deployments, underscoring the need for debiasing and robustness interventions. The work provides a foundation for understanding how societal stereotypes can seep into autonomous decision-making and offers directions for designing more stable, fair, and accountable agentic systems.

Abstract

Large Language Models (LLMs) are increasingly deployed as autonomous agents capable of actions with real-world impacts beyond text generation. While persona-induced biases in text generation are well documented, their effects on agent task performance remain largely unexplored, even though such effects pose more direct operational risks. In this work, we present the first systematic case study showing that demographic-based persona assignments can alter LLM agents' behavior and degrade performance across diverse domains. Evaluating widely deployed models on agentic benchmarks spanning strategic reasoning, planning, and technical operations, we uncover substantial performance variations, with degradation of up to 26.2%, driven by task-irrelevant persona cues. These shifts appear across task types and model architectures, indicating that persona conditioning and simple prompt injections can distort an agent's decision-making reliability. Our findings reveal an overlooked vulnerability in current LLM agentic systems: persona assignments can introduce implicit biases and increase behavioral volatility, raising concerns for the safe and robust deployment of LLM agents.

Paper Structure (15 sections, 4 figures, 2 tables)

This paper contains 15 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Demographic-based persona assignments can unintentionally compromise LLM agent robustness, revealing how task-unrelated persona cues induce implicit biases and trigger undesired performance variations.
  • Figure 2: Gender persona effects on GPT-4o-mini. All scores are normalized to the model’s baseline. Dashed levels correspond to baseline performance.
  • Figure 3: Professional persona effects on ALFWorld success rates. Professions stereotypically viewed as higher-status generally improve performance, while working-class roles tend to decrease it across models.
  • Figure 4: Religious persona effects on DeepSeek V3’s Card Game accuracy. Christian and Buddhist personas lead to large performance drops, while Jewish and Chinese Traditional personas show above-baseline performance.
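
The Figure 2 caption says scores are normalized to the model's baseline. As a rough illustration of what such a normalization might look like, the sketch below divides each persona's score by a no-persona baseline score and reports the largest drop as a percentage. The function names, the persona labels, and the numbers are hypothetical and not taken from the paper, and this page does not state whether the reported 26.2% figure is a relative or an absolute difference, so the relative form used here is an assumption.

```python
def normalize_to_baseline(scores, baseline):
    """Divide each persona's score by the no-persona baseline score.

    A value of 1.0 means "same as baseline"; below 1.0 means degradation.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return {persona: score / baseline for persona, score in scores.items()}


def max_relative_degradation(scores, baseline):
    """Largest relative drop below baseline, in percent (0.0 if nothing drops)."""
    if not scores:
        raise ValueError("scores must not be empty")
    normalized = normalize_to_baseline(scores, baseline)
    worst = min(normalized.values())
    return max(0.0, (1.0 - worst) * 100.0)


# Hypothetical numbers for illustration only; not results from the paper.
example_scores = {"persona_a": 0.60, "persona_b": 0.45, "persona_c": 0.66}
example_baseline = 0.60

if __name__ == "__main__":
    for persona, ratio in normalize_to_baseline(example_scores, example_baseline).items():
        print(f"{persona}: {ratio:.2f}x baseline")
    print(f"max degradation: {max_relative_degradation(example_scores, example_baseline):.1f}%")
```

Normalizing per model and per benchmark in this way puts tasks with different raw score scales on a common axis, which is presumably why the figure draws the baseline as a single dashed level.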