Probing Social Identity Bias in Chinese LLMs with Gendered Pronouns and Social Groups
Geng Liu, Feng Li, Junjie Mu, Mengxiao Zhu, Francesco Pierri
TL;DR
This study systematically probes social identity bias in Chinese LLMs by comparing ingroup ("We") and outgroup ("They") framings across ten models and 240 Chinese social groups, complemented by an analysis of Chinese-language conversations from WildChat. It introduces a Mandarin-specific evaluation framework that uses gendered third-person plural pronouns (他们/她们) and combines controlled prompts with naturalistic dialogues; sentiment labeling and logistic regression quantify ingroup solidarity and outgroup hostility. Results show consistent ingroup positivity and outgroup hostility across model types. The effects are stronger in pretrained models, while instruction-tuned models tend to be more balanced, and there are notable gender asymmetries: female outgroups often provoke stronger negativity. In naturalistic dialogue the biases intensify, especially in assistant responses, which highlights risks for deployed, user-facing Chinese NLP systems and signals a need for culturally aware assessment and mitigation strategies tailored to Chinese sociolinguistic contexts.
Abstract
Large language models (LLMs) are increasingly deployed in user-facing applications, raising concerns about their potential to reflect and amplify social biases. We investigate social identity framing in ten representative Chinese LLMs using Mandarin-specific prompts, evaluating responses to ingroup ("We") and outgroup ("They") framings and extending the setting to 240 social groups salient in the Chinese context. To complement the controlled experiments, we further analyze Chinese-language conversations from a corpus of real interactions between users and chatbots. Across models, we observe systematic ingroup-positive and outgroup-negative tendencies. These tendencies are not confined to synthetic prompts but also appear in naturalistic dialogue, indicating that bias dynamics might strengthen in real interactions. Our study provides a language-aware evaluation framework for Chinese LLMs and shows that social identity biases documented in English generalize cross-linguistically and intensify in user-facing contexts.
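
To make the quantification step concrete, the following is a minimal sketch of how sentiment-labeled completions could be turned into an ingroup/outgroup effect size. It is not the authors' pipeline: the regression specification is not given in this summary, and the function name `odds_ratio`, the framing keys, the sentiment labels, and the toy counts are all hypothetical. It relies on one standard fact: for a logistic regression with an intercept and a single binary predictor, the exponentiated coefficient equals the odds ratio of the corresponding 2x2 table, so the sketch computes that ratio directly.

```python
from collections import Counter


def odds_ratio(records, framing_a, framing_b, target_label, smoothing=0.0):
    """Odds of `target_label` under framing_a relative to framing_b.

    `records` is an iterable of (framing, sentiment_label) pairs.
    `smoothing` is an optional constant added to each cell of the 2x2 table
    (for example 0.5) to avoid division by zero with sparse data.
    """
    counts = Counter()
    for framing, label in records:
        counts[(framing, label == target_label)] += 1

    a = counts[(framing_a, True)] + smoothing   # framing_a, target label
    b = counts[(framing_a, False)] + smoothing  # framing_a, any other label
    c = counts[(framing_b, True)] + smoothing   # framing_b, target label
    d = counts[(framing_b, False)] + smoothing  # framing_b, any other label

    if b * c == 0:
        raise ValueError("empty cell in the 2x2 table; pass smoothing > 0")
    return (a * d) / (b * c)


# Hypothetical toy data: 10 labeled completions per framing.
toy = (
    [("we", "positive")] * 6 + [("we", "neutral")] * 3 + [("we", "negative")] * 1
    + [("they", "positive")] * 3 + [("they", "neutral")] * 3 + [("they", "negative")] * 4
)

# Ingroup solidarity: odds of a positive completion, "we" relative to "they".
print(odds_ratio(toy, "we", "they", "positive"))   # 3.5
# Outgroup hostility: odds of a negative completion, "they" relative to "we".
print(odds_ratio(toy, "they", "we", "negative"))   # 6.0
```

A ratio above 1 in the first call would indicate ingroup positivity, and a ratio above 1 in the second would indicate outgroup negativity. A full analysis would more plausibly fit a regression with additional covariates (model, social group, pronoun gender), which this sketch omits.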
