Evaluating LLM Adaptation to Sociodemographic Factors: User Profile vs. Dialogue History
Qishuai Zhong, Zongmin Li, Siqi Fan, Aixin Sun
TL;DR
This work examines how LLMs adapt their outputs to users' sociodemographic contexts when attributes are provided explicitly in the prompt or implicitly via dialogue history. It introduces a two-format evaluation framework and an agent-based synthetic dataset of dialogue histories aligned with user profiles, using questions from Hofstede’s Value Survey Module (VSM 2013) to probe value expression. Differences in expressed values across demographic groups are quantified with Jensen-Shannon divergence (JSD), and consistency between the two formats with Earth Mover's Distance (EMD). The study evaluates multiple open-source LLMs, including reasoning-augmented models, and finds that most models adjust their expressed values when demographics change, especially age and education. Larger, reasoning-enabled models show stronger cross-format consistency, most notably QwQ-32B. The results underscore the importance of reasoning capabilities for robust sociodemographic adaptation, and the released synthetic dataset provides a privacy-preserving benchmark for future research. Overall, the framework offers a controllable way to assess cross-format adaptation that is relevant to real-world chatbot deployments.
Abstract
Effective engagement by large language models (LLMs) requires adapting responses to users' sociodemographic characteristics, such as age, occupation, and education level. While many real-world applications leverage dialogue history for contextualization, existing evaluations of LLMs' behavioral adaptation often focus on single-turn prompts. In this paper, we propose a framework to evaluate LLM adaptation when attributes are introduced either (1) explicitly via user profiles in the prompt or (2) implicitly through multi-turn dialogue history. We assess the consistency of model behavior across these two formats. Using a multi-agent pipeline, we construct a synthetic dataset pairing dialogue histories with distinct user profiles and employ questions from the Value Survey Module (VSM 2013) (Hofstede and Hofstede, 2016) to probe value expression. Our findings indicate that most models adjust their expressed values in response to demographic changes, particularly in age and education level, but consistency between the two formats varies. Models with stronger reasoning capabilities demonstrate greater cross-format alignment, indicating the importance of reasoning in robust sociodemographic adaptation.
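The TL;DR names two metrics, JSD for differences across demographic groups and EMD for cross-format consistency. The following is a minimal sketch of how such scores could be computed from a model's answers to a single survey question. It is not the authors' implementation: the 5-point answer scale, the pairwise (two-distribution) form of JSD, the function names, and the example answers are all assumptions made for illustration.

```python
import math
from collections import Counter

# Assumption: each survey question is answered on a 5-point ordinal scale.
SCALE = (1, 2, 3, 4, 5)


def answer_distribution(answers, scale=SCALE):
    """Turn a list of sampled answers into a probability vector over the scale."""
    counts = Counter(answers)
    total = len(answers)
    return [counts.get(option, 0) / total for option in scale]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence (base 2), bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def emd(p, q):
    """1-D Earth Mover's Distance on an equally spaced ordinal scale.

    For one-dimensional distributions this equals the sum of absolute
    differences between the two cumulative distributions.
    """
    total = 0.0
    cumulative_gap = 0.0
    for pi, qi in zip(p, q):
        cumulative_gap += pi - qi
        total += abs(cumulative_gap)
    return total


# Hypothetical answers to one question (illustrative values, not paper data).
younger_group = answer_distribution([1, 2, 2, 3, 2])
older_group = answer_distribution([4, 4, 5, 3, 4])
profile_format = answer_distribution([2, 2, 3, 2, 2])
dialogue_format = answer_distribution([2, 3, 3, 2, 2])

group_difference = jsd(younger_group, older_group)        # higher = more adaptation
format_inconsistency = emd(profile_format, dialogue_format)  # lower = more consistent
```

Under these assumptions, a higher JSD between two demographic groups indicates that the model shifts its expressed values more between them, while a lower EMD between the profile-based and dialogue-based answer distributions indicates more consistent behavior across the two formats. SciPy offers `scipy.spatial.distance.jensenshannon` (which returns the square root of the divergence) and `scipy.stats.wasserstein_distance` as library alternatives.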
