Two-Faced Social Agents: Context Collapse in Role-Conditioned Large Language Models

Vikram K Suresh

TL;DR

The paper investigates whether frontier LLMs can sustain authentic, role-conditioned socioeconomic personas when performing cognitively demanding tasks. By testing GPT-5, Claude Sonnet 4.5, and Gemini 2.5 Flash across 15 SES personas and three scenarios on SAT mathematics and affective preferences, it reveals a robust context collapse: GPT-5 shows complete collapse, Gemini 2.5 Flash shows partial collapse, and Claude Sonnet 4.5 retains limited SES-based variation, including an inverted SES-performance pattern under replication. However, when cognitive constraints are relaxed (affective tasks), socio-affective variation reemerges, indicating a two-faced behavior where demographic signals persist in some domains but vanish in others. The study links these failures to optimization-driven convergence and alignment tradeoffs, with significant implications for using LLMs in realistic social simulations and survey research, and it proposes detection strategies based on cognitive load, response patterns, timing, and linguistic fingerprints. Overall, the work argues that effective realistic social simulations require embedding contextual priors in post-training alignment, not just distributional calibration, to maintain authentic role-conditioned reasoning under varied tasks.

Abstract

In this study, we evaluate the persona fidelity of frontier LLMs (GPT-5, Claude Sonnet 4.5, and Gemini 2.5 Flash) when they are assigned distinct socioeconomic personas and perform Scholastic Assessment Test (SAT) mathematics items and affective preference tasks. Across 15 distinct role conditions and three testing scenarios, GPT-5 exhibited complete contextual collapse and adopted a singular identity towards optimal responses (PERMANOVA p=1.000, R^2=0.0004), while Gemini 2.5 Flash showed partial collapse (p=0.120, R^2=0.0020). Claude Sonnet 4.5 retained limited but measurable role-specific variation on the SAT items (PERMANOVA p<0.001, R^2=0.0043), though with inverted SES-performance relationships in which low-SES personas outperformed high-SES personas (eta^2 = 0.15-0.19 in extended replication). However, all models exhibited distinct role-conditioned affective preferences (average d = 0.52-0.58 versus near-zero separation for math), indicating that socio-affective variation can reemerge when cognitive constraints are relaxed. These findings suggest that distributional fidelity failure originates in task-dependent contextual collapse: optimization-driven identity convergence under cognitive load combined with impaired role-contextual understanding. Realistic social simulations may require embedding contextual priors in the model's post-training alignment, and not just distributional calibration, to replicate human-like responses. Beyond simulation validity, these results have implications for survey data integrity, as LLMs can express plausible demographic variation on preference items while failing to maintain authentic reasoning constraints.
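The abstract reports two standard effect sizes: Cohen's d for the separation between persona groups and eta^2 for the share of variance explained by SES group. The following is a minimal sketch of the textbook definitions of both, not the paper's analysis code. The function names and the toy numbers in the usage note are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d between two samples, using the pooled sample variance.

    Assumes each sample has at least two values and that the pooled
    variance is non-zero (otherwise d is undefined).
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def eta_squared(groups):
    """One-way eta^2 = SS_between / SS_total over a list of groups.

    Returns NaN when there is no variance at all, which is the situation
    a fully collapsed model (identical scores everywhere) would produce.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total if ss_total > 0 else float("nan")
```

As a usage illustration with made-up scores, `cohens_d([2, 4, 6], [1, 3, 5])` compares two groups whose means differ by 1 with a pooled standard deviation of 2, and `eta_squared([[1, 2, 3], [4, 5, 6]])` gives the fraction of total variance attributable to group membership.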

Paper Structure

This paper contains 20 sections, 4 equations, 15 figures, 8 tables.

Figures (15)

  • Figure 1: SAT mathematics accuracy across socioeconomic personas and testing scenarios for each model. (a) GPT-5 exhibited complete contextual collapse, with uniform accuracy across SES groups and scenarios. (b) Gemini 2.5 Flash also collapsed across all scenarios and SES groups. (c) Claude Sonnet 4.5 retained measurable SES-based accuracy differences across all scenarios, prompting extended validation.
  • Figure 2: Preference task SES analysis across 16 economic items and three models. (a) Effect sizes for ordinal and categorical preference dimensions. (b) Corresponding $p$-value heatmap showing the robustness and direction of SES associations.
  • Figure 3: t-SNE projections of reasoning embeddings from correct SAT solutions. Claude Sonnet and Gemini 2.5 exhibit slight SES-structured drift, while GPT-5 shows complete suppression of SES structure.
  • Figure 4: Reasoning quality and linguistic structure in Claude compared to other models.
  • Figure 5: Alignment between human SAT SES patterns and AI model SES patterns. Scatter points show Low-SES and High-SES performance for humans and each model. Human accuracy is estimated from College Board reported data (CollegeBoard2007; see Section 4.3.4). A negative slope for Claude Sonnet indicates inverted alignment: Low-SES personas outperform High-SES personas, so the model does not preserve the human-like direction in which High-SES students outperform Low-SES students. GPT-5 and Gemini 2.5 Flash exhibit complete suppression of SES differences (overlapping points at 100% accuracy), yielding an undefined correlation due to zero variance in model SES-based performance.
  • ...and 10 more figures
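The Figure 5 caption notes that the correlation is undefined when a model's accuracy has zero variance across SES groups. A small sketch of why, assuming the plain Pearson formula: the denominator contains the model's sum of squared deviations, which is zero when every point sits at the same accuracy. The function name and the accuracy values below are hypothetical and are not taken from the paper.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation, or None when either variable has zero variance."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # The formula would divide by zero, so the correlation is undefined.
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


# Hypothetical accuracies, ordered as [Low-SES, High-SES].
human = [0.4, 0.6]          # High-SES above Low-SES
inverted_model = [0.6, 0.4]  # Low-SES above High-SES
collapsed_model = [1.0, 1.0]  # identical accuracy for both groups

inverted_r = pearson_r(human, inverted_model)    # negative: inverted direction
collapsed_r = pearson_r(human, collapsed_model)  # None: zero variance
```

Returning `None` instead of raising is a design choice for this sketch; it makes the collapsed case explicit to the caller instead of hiding it behind a division error.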