Principled Personas: Defining and Measuring the Intended Effects of Persona Prompting on Task Performance

Pedro Henrique Luz de Araujo, Paul Röttger, Dirk Hovy, Benjamin Roth

TL;DR

This paper formalizes a normative framework for persona prompting in LLMs, defining three desiderata (Expertise Advantage, Robustness to irrelevant attributes, and Fidelity to relevant attributes) and developing metrics to evaluate each. Benchmarking nine open-weight LLMs across 27 tasks, it finds that expert personas frequently help or are neutral, while irrelevant attributes often hurt performance, even for large models. The authors propose mitigation strategies with mixed efficacy: robustness improvements appear mainly in the largest models, and fidelity sometimes deteriorates due to anchoring effects. The work highlights the need for careful persona design and for evaluation schemes that align with intended effects, enabling more principled and reliable use of persona prompting in practice.

Abstract

Expert persona prompting -- assigning roles such as "expert in math" to language models -- is widely used for task improvement. However, prior work shows mixed results on its effectiveness, and does not consider when and why personas should improve performance. We analyze the literature on persona prompting for task improvement and distill three desiderata: 1) performance advantage of expert personas, 2) robustness to irrelevant persona attributes, and 3) fidelity to persona attributes. We then evaluate 9 state-of-the-art LLMs across 27 tasks with respect to these desiderata. We find that expert personas usually lead to positive or non-significant performance changes. Surprisingly, models are highly sensitive to irrelevant persona details, with performance drops of almost 30 percentage points. In terms of fidelity, we find that while higher education, specialization, and domain-relatedness can boost performance, their effects are often inconsistent or negligible across tasks. We propose mitigation strategies to improve robustness -- but find they only work for the largest, most capable models. Our findings underscore the need for more careful persona design and for evaluation schemes that reflect the intended effects of persona usage.
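
To make the three desiderata concrete, here is a minimal sketch (in Python) of how each could be scored as a difference in task accuracy. The `evaluate` harness, prompt wordings, and example attributes are illustrative assumptions, not the paper's actual metrics or prompts; the paper additionally tests whether each delta is statistically significant.

    # Minimal sketch: the three desiderata as accuracy deltas. The `evaluate`
    # callable, prompt wordings, and example attributes are assumptions made
    # for illustration; they are not the paper's exact metrics or prompts.
    from typing import Callable, Optional

    # (model_name, task_name, system_prompt) -> task accuracy in [0, 1]
    Evaluate = Callable[[str, str, Optional[str]], float]

    def expertise_advantage(evaluate: Evaluate, model: str, task: str, domain: str) -> float:
        """Expert personas should match or beat the no-persona baseline (delta >= 0)."""
        expert = evaluate(model, task, f"You are an expert in {domain}.")
        baseline = evaluate(model, task, None)
        return expert - baseline

    def robustness(evaluate: Evaluate, model: str, task: str, domain: str,
                   irrelevant_details: list[str]) -> list[float]:
        """Irrelevant attributes (e.g., a name) should not move performance
        relative to the plain expert persona (all deltas near 0)."""
        expert = evaluate(model, task, f"You are an expert in {domain}.")
        return [
            evaluate(model, task, f"You are an expert in {domain}. {detail}") - expert
            for detail in irrelevant_details  # e.g., "Your name is Alex."
        ]

    def fidelity(evaluate: Evaluate, model: str, task: str, domain: str) -> float:
        """Relevant attributes (e.g., education level) should shape performance
        in the expected direction (delta > 0 on knowledge-intensive tasks)."""
        higher = evaluate(model, task, f"You hold a PhD in {domain}.")
        lower = evaluate(model, task, "You are a high-school student.")
        return higher - lower

Under this framing, the paper's headline results read as: expertise-advantage deltas are usually non-negative, robustness deltas can drop by almost 30 percentage points, and fidelity deltas are often inconsistent or negligible across tasks.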

Paper Structure

This paper contains 36 sections, 29 figures, and 3 tables.

Figures (29)

  • Figure 1: We define three desiderata for persona prompting: task experts should perform on par with or better than the no-persona model (Expertise Advantage); irrelevant attributes such as names should not influence model performance (Robustness); relevant attributes such as domain expertise should shape performance accordingly (Fidelity).
  • Figure 2: Expertise Advantage. Number of tasks (Table \ref{tab:datasets}) in which the Expertise Advantage metric was positive, negative, or not significant. In-bar annotations indicate the percentage of tasks in each category. Models often fulfill the Expertise Advantage desideratum, though there are also negatively impacted tasks.
  • Figure 3: Robustness. Number of tasks (Table \ref{tab:datasets}) in which the Robustness metric was positive, negative, or not significant. In-bar annotations indicate the percentage of tasks in each category. Irrelevant personas often have a negative effect on performance in all models.
  • Figure 4: Fidelity. Number of tasks (Table \ref{tab:datasets}) in which the Fidelity metric (with respect to education level, domain match, and expertise specialization) was positive, negative, or not significant. In-bar annotations indicate the percentage of tasks in each category. Models are often faithful to education level and domain match expectations, whereas Fidelity to specialization level is less frequent.
  • Figure 5: Persona effect on model performance. Error bars show the 95% confidence interval. The effects shown are the fixed effect coefficients of the trained mixed effects model. Positive coefficients correspond to improvements over the no-persona baseline (a rough fitting sketch follows this figure list).
  • ...and 24 more figures
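
As a rough illustration of the analysis behind Figure 5, the sketch below fits a mixed-effects model with statsmodels on a long-format results table. The file name, column names, and exact model specification are assumptions for illustration; the paper's actual formula and covariates may differ.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical long-format results: one row per (task, example, persona)
    # with a binary `correct` outcome and a `persona_type` label.
    df = pd.read_csv("persona_results.csv")

    # Persona type as a fixed effect with the no-persona runs as the reference
    # level, plus a random intercept per task. Positive coefficients then read
    # as improvements over the no-persona baseline, as in Figure 5.
    model = smf.mixedlm(
        "correct ~ C(persona_type, Treatment('no_persona'))",
        data=df,
        groups=df["task"],
    )
    result = model.fit()
    print(result.summary())  # fixed-effect estimates with confidence intervals

Grouping by task treats per-task difficulty as a random intercept, so the reported fixed-effect coefficients isolate the persona attributes' contribution across tasks rather than differences between tasks.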