Two-Faced Social Agents: Context Collapse in Role-Conditioned Large Language Models
Vikram K Suresh
TL;DR
The paper investigates whether frontier LLMs can sustain authentic, role-conditioned socioeconomic status (SES) personas when performing cognitively demanding tasks. Testing GPT-5, Claude Sonnet 4.5, and Gemini 2.5 Flash across 15 SES personas and three scenarios on SAT mathematics items and affective preference tasks, it finds context collapse of varying severity: GPT-5 collapses completely, Gemini 2.5 Flash collapses partially, and Claude Sonnet 4.5 retains limited SES-based variation, including an inverted SES-performance pattern in replication, where low-SES personas outperform high-SES personas. When cognitive constraints are relaxed (affective tasks), however, socio-affective variation reemerges in all models. The result is a two-faced behavior in which demographic signals persist in some domains but vanish in others. The study attributes these failures to optimization-driven identity convergence under cognitive load and to impaired role-contextual understanding, with significant implications for using LLMs in realistic social simulations and survey research, and it proposes detection strategies based on cognitive load, response patterns, timing, and linguistic fingerprints. Overall, the work argues that realistic social simulations require embedding contextual priors in post-training alignment, not just distributional calibration, so that role-conditioned reasoning stays authentic across tasks.
Abstract
In this study, we evaluate the persona fidelity of three frontier LLMs (GPT-5, Claude Sonnet 4.5, and Gemini 2.5 Flash) when they are assigned distinct socioeconomic status (SES) personas and asked to complete Scholastic Assessment Test (SAT) mathematics items and affective preference tasks. Across 15 distinct role conditions and three testing scenarios, GPT-5 exhibited complete contextual collapse, converging on a single identity that favors optimal responses (PERMANOVA p=1.000, R^2=0.0004), while Gemini 2.5 Flash showed partial collapse (p=0.120, R^2=0.0020). Claude Sonnet 4.5 retained limited but measurable role-specific variation on the SAT items (PERMANOVA p<0.001, R^2=0.0043), though with an inverted SES-performance relationship in which low-SES personas outperformed high-SES personas (eta^2 = 0.15-0.19 in extended replication). However, all models exhibited distinct role-conditioned affective preferences (average d = 0.52-0.58, versus near-zero separation for math), indicating that socio-affective variation can reemerge when cognitive constraints are relaxed. These findings suggest that distributional fidelity failure originates in task-dependent contextual collapse: optimization-driven identity convergence under cognitive load, combined with impaired role-contextual understanding. Realistic social simulations may therefore require embedding contextual priors in the model's post-training alignment, and not just distributional calibration, to replicate human-like responses. Beyond simulation validity, these results have implications for survey data integrity, because LLMs can express plausible demographic variation on preference items while failing to maintain authentic reasoning constraints.
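For readers less familiar with the effect sizes quoted above, the following is a minimal sketch of how Cohen's d and eta^2 are conventionally computed. It is not the authors' analysis code: the function names and the toy scores are hypothetical, and the textbook definitions are assumed (pooled-standard-deviation d, and eta^2 as between-group over total sum of squares). The PERMANOVA R^2 is usually defined by the same between/total ratio, applied to a distance matrix rather than raw scores.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def eta_squared(groups):
    """Share of total variance explained by group membership (one-way layout)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total


# Hypothetical toy data, not values from the paper.
collapsed = [[1, 2, 3], [1, 2, 3]]   # two personas with identical score distributions
separated = [[1, 1], [3, 3]]         # two personas whose scores never overlap
```

On the toy data, `eta_squared(collapsed)` is 0 because persona membership explains none of the variance, which is the pattern a complete collapse would produce, while `eta_squared(separated)` is 1. Reported values such as R^2=0.0004 sit very close to the first case.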
