What Persona Are We Missing? Identifying Unknown Relevant Personas for Faithful User Simulation
Weiwen Su, Yuhan Zhou, Zihan Wang, Naoki Yoshinaga, Masashi Toyoda
TL;DR
This work addresses the problem that user simulations may be unfaithful when relevant personas are missing. It introduces the PICQ dataset, a context-aware set of Persona-Influenced Choice Questions annotated with unknown personas, and a multi-task prompting framework to identify these personas. A benchmark of diverse LLMs reveals a fidelity–influence trade-off governed by model size: influence generally grows with scale, while fidelity to human patterns follows an inverted U-shaped curve, which the authors trace to the human tendency toward cognitive economy. This suggests that no single model fully captures unknown personas. The authors propose a synergistic Generate-Complete-Validate workflow that combines multiple models and human input to produce more faithful simulations, note limitations in language and data-domain coverage, and outline directions for real-world validation.
Abstract
Existing user simulations, in which models generate user-like responses in dialogue, often do not verify that sufficient user personas are provided, which calls the validity of the simulations into question. To address this concern, this work explores the task of identifying relevant but unknown personas of the simulation target for a given simulation context. We introduce PICQ, a novel dataset of context-aware choice questions annotated with unknown personas (e.g., "Is the user price-sensitive?") that may influence user choices, and propose a multi-faceted evaluation scheme assessing fidelity, influence, and inaccessibility. Our benchmark of leading LLMs reveals a complex "Fidelity vs. Insight" dilemma governed by model scale: while influence generally scales with model size, fidelity to human patterns follows an inverted U-shaped curve. We trace this phenomenon to cognitive differences, particularly the human tendency for "cognitive economy." Our work provides the first comprehensive benchmark for this crucial task, offering a new lens for understanding the divergent cognitive models of humans and advanced LLMs.
