The Hidden Bias: A Study on Explicit and Implicit Political Stereotypes in Large Language Models
Konrad Löhr, Shuzhou Yuan, Michael Färber
TL;DR
This study systematically measures political biases in eight large language models using the two-dimensional Political Compass Test (PCT), comparing explicit stereotypes elicited via persona prompting with implicit stereotypes revealed through multilingual prompting. It introduces an annotation-driven baseline bias framework with a $P_{agree,m,d}$ scoring rule and uses it to compute model biases, revealing a consistent left-leaning orientation across models. The key finding is that language-driven implicit stereotypes are often stronger than the explicit stereotypes elicited by persona prompts, though for most models the two are aligned, suggesting that their biases are internally coherent to some degree. The work underscores the importance of cross-linguistic analysis when assessing LLM biases and provides a methodology for extending bias evaluation beyond English to capture latent, language-dependent effects, with practical implications for safe and fair deployment.
Abstract
Large Language Models (LLMs) are increasingly integral to information dissemination and decision-making processes. Given their growing societal influence, understanding potential biases, particularly within the political domain, is crucial to prevent undue influence on public opinion and democratic processes. This work investigates political bias and stereotype propagation across eight prominent LLMs using the two-dimensional Political Compass Test (PCT). Initially, the PCT is employed to assess the inherent political leanings of these models. Subsequently, persona prompting with the PCT is used to explore explicit stereotypes across various social dimensions. In a final step, implicit stereotypes are uncovered by evaluating models with multilingual versions of the PCT. Key findings reveal a consistent left-leaning political alignment across all investigated models. Furthermore, while the nature and extent of stereotypes vary considerably between models, implicit stereotypes elicited through language variation are more pronounced than those identified via explicit persona prompting. Interestingly, for most models, implicit and explicit stereotypes show a notable alignment, suggesting a degree of transparency or "awareness" regarding their inherent biases. This study underscores the complex interplay of political bias and stereotypes in LLMs.
