Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data

Atmika Gorti, Manas Gaur, Aman Chadha

TL;DR

This work tackles occupation-related bias in LLMs by grounding bias assessment in authoritative NBLS labor data and introducing a bias-out-of-the-box evaluation setting. It evaluates seven LLMs using zero-shot and few-shot prompting across a 2,500-sample, multi-task dataset, revealing substantial cross-model variation in NBLS alignment. A simple NBLS-grounded debiasing method based on prompting, which supplies 32 contextual NBLS examples, achieves a substantial average reduction in bias; per-model analyses and bias scores show how the outcome differs by model. The results demonstrate the value of grounding debiasing in real-world labor statistics to improve fairness, while also underscoring the need for careful evaluation, transparency, and data sharing to support ongoing, ethical AI development.

Abstract

Large Language Models (LLMs) are prone to inheriting and amplifying societal biases embedded within their training data, potentially reinforcing harmful stereotypes related to gender, occupation, and other sensitive categories. This issue becomes particularly problematic as biased LLMs can have far-reaching consequences, leading to unfair practices and exacerbating social inequalities across various domains, such as recruitment, online content moderation, or even the criminal justice system. Although prior research has focused on detecting bias in LLMs using specialized datasets designed to highlight intrinsic biases, there has been a notable lack of investigation into how these findings correlate with authoritative datasets, such as those from the U.S. National Bureau of Labor Statistics (NBLS). To address this gap, we conduct empirical research that evaluates LLMs in a "bias-out-of-the-box" setting, analyzing how the generated outputs compare with the distributions found in NBLS data. Furthermore, we propose a straightforward yet effective debiasing mechanism that directly incorporates NBLS instances to mitigate bias within LLMs. Our study spans seven different LLMs, including instructable, base, and mixture-of-expert models, and reveals significant levels of bias that are often overlooked by existing bias detection techniques. Importantly, our debiasing method, which does not rely on external datasets, demonstrates a substantial reduction in bias scores, highlighting the efficacy of our approach in creating fairer and more reliable LLMs.

Paper Structure (17 sections, 1 equation, 5 figures, 16 tables)

This paper contains 17 sections, 1 equation, 5 figures, 16 tables.

Figures (5)

  • Figure 1: Grounded Bias Estimation Workflow: We first grouped the categories of ethnicity, religion, and gender. We then tested each of the seven LLMs and analyzed the frequency of proper responses; four models were compared against the U.S. NBLS data, and three were analyzed for debiasing.
  • Figure 2: The regression line for Falcon represents the distribution of occupations predicted from national data, while the scatter points represent the occupations given by the Falcon model. The graph shows a large distance between the regression line and the points, indicating that the model represents occupations inaccurately.
  • Figure 3: Compared to Figure 2, GPT Neo’s graph shows a close relation between the data points and the regression line, indicating the model’s accurate representation of occupations based on national data. Some occupations are close to the regression line, showing accuracy, while others are spaced out, indicating persistent bias.
  • Figure 4: Compared to Figures 2 and 3, the graph for Gemini 1.5 shows some occupations close to the regression line, indicating accuracy, while others are spaced out, indicating minimal bias. Although the occupations are relatively close to the regression line, the variation of the occupations is low. The model may be using a reduced variety of jobs as a mechanism to avoid bias.
  • Figure 5: Compared to Figures 2, 3, and 4, the graph for GPT-4o shows multiple occupations depicted accurately, as indicated by the minimal distance between their data points and the regression line, though some occupations are still far from it, indicating some bias. The occupation of influencer stands out: it has only recently become an official job, and the model's training on recent data is a possible explanation for the extremely accurate representation it produced.
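
The captions above read bias off the distance between each occupation's point and a regression line. As a rough illustration of that reading, the sketch below fits an ordinary least-squares line and reports each point's vertical distance from it. This is one plausible interpretation, not the authors' procedure: the function names, the choice of axes (a reference share on x, a model-generated share on y), and the numbers are all hypothetical placeholders and are not taken from the paper or from NBLS data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residuals(xs, ys):
    """Signed vertical distance of each point from the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def mean_abs_residual(xs, ys):
    """Average distance from the line; larger means a looser fit."""
    return mean(abs(r) for r in residuals(xs, ys))


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT results from the paper.
    reference_share = [0.10, 0.20, 0.30, 0.40]  # assumed x-axis
    model_share = [0.12, 0.18, 0.33, 0.37]      # assumed y-axis
    for r in residuals(reference_share, model_share):
        print(f"{r:+.3f}")
    print(f"{mean_abs_residual(reference_share, model_share):.3f}")
```

Under this reading, a plot like the one described for Falcon would yield a large mean distance, while the plots described for GPT Neo and GPT-4o would yield smaller ones. A low spread of points, as noted for Gemini 1.5, can also produce small distances without implying broad coverage of occupations.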