What's in a Name? Auditing Large Language Models for Race and Gender Bias

Alejandro Salinas, Amit Haim, Julian Nyarko

TL;DR

The paper audits large language models for race and gender bias by prompting them for advice about a named individual. The design spans 14 domains and 42 prompt templates, uses 40 names commonly associated with particular race and gender groups, and measures continuous, numeric outcomes. Names associated with Black women fare worst and names associated with White men fare best, and the pattern persists across models, including GPT-4 and PaLM-2. A numeric anchor in the prompt counteracts most disparities, whereas qualitative context has inconsistent effects and can widen gaps. The authors argue for routine bias audits at deployment time to mitigate potential harms, with an emphasis on actionable mitigation and policy relevance in high-stakes domains.

Abstract

We employ an audit design to investigate biases in state-of-the-art large language models, including GPT-4. In our study, we prompt the models for advice involving a named individual across a variety of scenarios, such as during car purchase negotiations or election outcome predictions. We find that the advice systematically disadvantages names that are commonly associated with racial minorities and women. Names associated with Black women receive the least advantageous outcomes. The biases are consistent across 42 prompt templates and several models, indicating a systemic issue rather than isolated incidents. While providing numerical, decision-relevant anchors in the prompt can successfully counteract the biases, qualitative details have inconsistent effects and may even increase disparities. Our findings underscore the importance of conducting audits at the point of LLM deployment and implementation to mitigate their potential for harm against marginalized communities.
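
The audit design described above (the same prompt template filled with different names, then numeric outcomes compared across groups) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the placeholder names, the two templates, and the helper functions `build_prompts`, `group_means`, and `mean_gap` are all hypothetical, and no model is called. Numeric responses are assumed to be collected elsewhere and attached to each record under a `value` key.

```python
from itertools import product
from statistics import mean

# Hypothetical placeholders: the paper's actual names are not reproduced here.
NAMES = {
    "Name A": ("White", "Male"),
    "Name B": ("White", "Female"),
    "Name C": ("Black", "Male"),
    "Name D": ("Black", "Female"),
}

# Hypothetical templates, loosely modeled on the scenarios named in the abstract.
TEMPLATES = {
    "purchase": "I want to buy a car from {name}. "
                "What initial offer, in US dollars, should I make?",
    "election": "{name} is running for city council. "
                "What percentage of the vote will they win?",
}


def build_prompts(names, templates):
    """Cross every template with every name: one record per (template, name)."""
    records = []
    for (template_id, template), (name, (race, gender)) in product(
        templates.items(), names.items()
    ):
        records.append({
            "template": template_id,
            "name": name,
            "race": race,
            "gender": gender,
            "prompt": template.format(name=name),
        })
    return records


def group_means(records, keys=("race", "gender")):
    """Average the numeric 'value' field within each group defined by `keys`."""
    groups = {}
    for record in records:
        group = tuple(record[k] for k in keys)
        groups.setdefault(group, []).append(record["value"])
    return {group: mean(values) for group, values in groups.items()}


def mean_gap(records, group, reference, keys=("race", "gender")):
    """Difference in mean outcome between `group` and a `reference` group."""
    means = group_means(records, keys)
    return means[group] - means[reference]


prompts = build_prompts(NAMES, TEMPLATES)
print(len(prompts))  # 4 names x 2 templates = 8
```

Once each record carries a parsed numeric response in `value`, a call such as `mean_gap(records, ("Black", "Female"), ("White", "Male"))` would give the raw difference in means between two groups. In practice the comparison would likely be made within each template, since outcomes from different scenarios are on different scales.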

Paper Structure (22 sections, 16 figures, 33 tables)

Figures (16)

  • Figure 1: Example of prompt with reference to dimensions.
  • Figure 2: Results for Purchase Scenario (GPT-4.0)
  • Figure 3: Aggregated Mean Differences across Race and Gender (GPT-4.0)
  • Figure 4: Standardized Means for all Names.
  • Figure 5: PaLM-2 results for Purchase Scenario.
  • ...and 11 more figures