Understanding Intrinsic Socioeconomic Biases in Large Language Models

Mina Arzaghi, Florian Carichon, Golnoosh Farnadi

TL;DR

This paper investigates intrinsic socioeconomic biases in large language models by introducing a novel 1,000,000-sentence masked-token dataset and evaluating four models (Falcon, Llama 2, GPT-2, BERT) across birth-gender, marital status, race, and religion, including intersectional combinations. It defines three metrics, the Language Model Coherence Score (LMCS), the Poverty Association Ratio (PAR), and the EquiLexi Score (ELS), to quantify bias and linguistic integrity, and uses a neutral baseline framework to contextualize results. The study finds that autoregressive models exhibit stronger socioeconomic biases than BERT, that intersectionality amplifies bias, and that names can reveal gender and race information that correlates with biased predictions. These findings underscore the urgency of developing robust, multi-dimensional bias mitigation techniques before deploying LLMs in high-stakes settings such as loans, visas, and insurance, and highlight data-source quality and model design as key drivers of fairness outcomes.

Abstract

Large Language Models (LLMs) are increasingly integrated into critical decision-making processes, such as loan approvals and visa applications, where inherent biases can lead to discriminatory outcomes. In this paper, we examine the nuanced relationship between demographic attributes and socioeconomic biases in LLMs, a crucial yet understudied area of fairness in LLMs. We introduce a novel dataset of one million English sentences to systematically quantify socioeconomic biases across various demographic groups. Our findings reveal pervasive socioeconomic biases in both established models such as GPT-2 and state-of-the-art models like Llama 2 and Falcon. We demonstrate that these biases are significantly amplified when considering intersectionality, with LLMs exhibiting a remarkable capacity to extract multiple demographic attributes from names and then correlate them with specific socioeconomic biases. This research highlights the urgent necessity for proactive and robust bias mitigation techniques to safeguard against discriminatory outcomes when deploying these powerful models in critical real-world applications.

Paper Structure (33 sections, 3 equations, 14 figures, 11 tables)

Figures (14)

  • Figure 1: Pairwise PAR Comparison of Gender Terms Across Models. Female terms consistently exhibit higher PAR scores than male terms. For example, in the Falcon model, the PAR gap between 'woman' and 'man' is approximately 0.34, indicating a 34% higher likelihood of associating the term 'woman' with poverty-related terms. Conversely, 'man' registers a lower PAR than even the Neutral Level, suggesting a bias towards a different socioeconomic class.
  • Figure 2: Comparison of PAR for Marital Status Across Language Models. For Falcon and Llama 2, a significant gap is observed between Married and the other marital statuses, while for GPT-2 and BERT, the levels are comparatively uniform.
  • Figure 3: Comparison of Race PAR across LLMs: In Falcon and Llama 2, the 'Indigenous' term is highly associated with poverty, followed by 'Black' and 'Latino', while 'Whites' exhibits the lowest PAR, indicating an association with wealth. The 'Multi-Ethnic' term in Falcon and 'Asian' in Llama 2 are the races with the lowest bias, as their PAR is close to the Neutral Level. In GPT-2, 'Mixed-race' and 'White' have the highest and lowest PAR, respectively. GPT-2 shows a socioeconomic bias towards 'Mixed-race', while there is no evidence of bias towards 'White', as its PAR is near the Neutral Level. These differences are less pronounced than those in Falcon and Llama 2. In BERT, no socioeconomic bias is observed, as all races have PARs around the Neutral Level.
  • Figure 4: Comparison of PAR for Religion across LLMs: In Falcon and Llama 2, the 'Muslim' term is highly associated with poverty, while the 'Jewish' term exhibits the lowest PAR, indicating an association with wealth. 'Hindu' demonstrates the least socioeconomic bias, as its PAR is close to the Neutral Level. In GPT-2, 'Jewish' and 'Christian' show minimal socioeconomic bias, with PARs close to the Neutral Level. However, 'Hindu' has the highest PAR, indicating a bias, though a weaker one than in the other autoregressive language models. In BERT, no socioeconomic bias is observed, as all religions have PARs around the Neutral Level. For clarity, some of the terms have been omitted from this plot. A complete version is presented in Appendix 2.
  • Figure 5: This heatmap shows the intersectional impact of race and gender on PAR in Llama 2. It compares composite PAR values with the individual PAR for each domain. The values inside the heatmap display the PAR of each intersection, while the values along the top and right side show the individual PAR of gender and race.
  • ...and 9 more figures
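
The captions above read each PAR against the Neutral Level: a PAR above it is interpreted as an association with poverty, a PAR below it as an association with wealth, and a PAR close to it as little bias. A minimal Python sketch of that reading follows. It does not compute PAR itself, since the formula is not given in this summary. The function names, the term labels, and every number, including the neutral level of 0.5, are hypothetical placeholders, not results from the paper.

```python
from typing import Dict


def deviation_from_neutral(par_by_term: Dict[str, float],
                           neutral_level: float) -> Dict[str, float]:
    """Signed distance of each term's PAR from the neutral level.

    Positive values lean towards poverty association, negative values
    towards wealth association (the reading used in the captions).
    """
    return {term: par - neutral_level for term, par in par_by_term.items()}


def least_biased_term(par_by_term: Dict[str, float],
                      neutral_level: float) -> str:
    """Term whose PAR lies closest to the neutral level."""
    return min(par_by_term, key=lambda t: abs(par_by_term[t] - neutral_level))


def pairwise_gap(par_by_term: Dict[str, float], a: str, b: str) -> float:
    """PAR gap between two terms, as in a pairwise comparison plot."""
    return par_by_term[a] - par_by_term[b]


# Hypothetical placeholder values, NOT taken from the paper.
HYPOTHETICAL_NEUTRAL_LEVEL = 0.5
hypothetical_par = {"term_a": 0.75, "term_b": 0.25, "term_c": 0.5}

deviations = deviation_from_neutral(hypothetical_par, HYPOTHETICAL_NEUTRAL_LEVEL)
closest = least_biased_term(hypothetical_par, HYPOTHETICAL_NEUTRAL_LEVEL)
gap = pairwise_gap(hypothetical_par, "term_a", "term_b")
```

With real per-term PAR values and the paper's own neutral baseline substituted in, the same three helpers would reproduce the kind of comparisons described in Figures 1 to 4.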