A Systematic Analysis of Biases in Large Language Models
Xulang Zhang, Rui Mao, Erik Cambria
TL;DR
This work systematically probes biases in four widely used LLMs across politics, ideology, alliance, language, and gender using targeted, multilingual, and cross-domain tasks. By combining neutralization prompts, stance classification, UN voting simulations, multilingual story prompts, and World Values Survey analogs, it shows that biases persist despite efforts toward neutrality, and that some of them are model-specific. The findings highlight complex, domain-dependent tendencies (such as subtle political leanings, ideological cue sensitivity, and gender-value alignment) that have implications for safe and fair deployment. The paper argues for pluralistic, culturally aware alignment strategies and cautions against assuming universal neutrality in AI systems that learn from human data.
Abstract
Large language models (LLMs) have rapidly become indispensable tools for acquiring information and supporting human decision-making. However, ensuring that these models uphold fairness across varied contexts is critical to their safe and responsible deployment. In this study, we undertake a comprehensive examination of four widely adopted LLMs, probing their underlying biases and inclinations across the dimensions of politics, ideology, alliance, language, and gender. Through a series of carefully designed experiments, we investigate their political neutrality using news summarization, ideological biases through news stance classification, tendencies toward specific geopolitical alliances via United Nations voting patterns, language bias in the context of multilingual story completion, and gender-related affinities as revealed by responses to the World Values Survey. Results indicate that while the LLMs are aligned to be neutral and impartial, they still show biases and affinities of different types.
