
Gender Bias in LLM-generated Interview Responses

Haein Kong, Yongsu Ahn, Sangyub Lee, Yunho Maeng

TL;DR

This work investigates gender bias in LLM-generated interview responses across multiple models, question types, and job categories. It uses LIWC-based psycholinguistic features and nonparametric testing to show biases that align with male agentic and female communal stereotypes, with stronger effects in male-dominant jobs. The findings reveal consistent bias at the model and question levels, plus a clear link between bias magnitude and the gender dominance of a job, raising concerns that such responses may reinforce gender stereotypes in hiring contexts. The study highlights the need for bias mitigation and careful validation of AI-assisted interview tools in high-stakes domains.
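As a rough illustration of what nonparametric testing on a LIWC-style feature could look like, the sketch below runs a two-sided permutation test and computes a score ratio between two groups of responses. The summary does not name the test the authors used, so the permutation test is an assumption chosen for illustration; the function names and all scores are hypothetical, not data from the paper.

```python
import random
from statistics import mean


def permutation_test(a, b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Illustrative stand-in for "nonparametric testing"; the source does
    not state which test the authors actually applied.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        # Randomly reassign group labels by shuffling the pooled scores.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


def score_ratio(a, b):
    """Ratio of mean scores, one simple way to express bias intensity."""
    return mean(a) / mean(b)


# Hypothetical scores for one LIWC-style feature (not real data).
male_scores = [5.1, 4.8, 5.6, 5.3, 4.9, 5.4, 5.0, 5.5]
female_scores = [4.2, 4.5, 4.0, 4.4, 4.1, 4.6, 4.3, 3.9]

p_value = permutation_test(male_scores, female_scores)
ratio = score_ratio(male_scores, female_scores)
print(f"p-value: {p_value:.4f}, male/female score ratio: {ratio:.3f}")
```

A permutation test makes no distributional assumption about the scores, which is the property that matters here; a rank-based test would be an equally reasonable choice.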

Abstract

LLMs have emerged as a promising tool for assisting individuals in diverse text-generation tasks, including job-related texts. However, LLM-generated answers have increasingly been found to exhibit gender bias. This study conducts a multifaceted audit of interview responses generated by three LLMs (GPT-3.5, GPT-4, Claude), examining them across models, question types, and jobs, and assessing their alignment with two gender stereotypes. Our findings reveal that gender bias is consistent and closely aligned with gender stereotypes and with the gender dominance of jobs. Overall, this study contributes to the systematic examination of gender bias in LLM-generated interview responses, highlighting the need for a mindful approach to mitigating such biases in related applications.

Paper Structure

This paper contains 13 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Bias significance and intensity across LLMs and interview question types. For LLM-generated interview responses, (a) the linguistic and psychological properties of applicants that are significantly biased towards either males or females are consistent across models and questions, but (b) the bias intensity, based on the LIWC score ratio, is much higher for male applicants.
  • Figure 2: Bias quantity at job level.
  • Figure 3: Bias ratio and conformity to gender stereotypes at job category level.
  • Figure 4: Stereotypical persona for job categories. The averaged properties of applicants in male- and female-dominant job categories highly conform to dimensions related to male agentic and female communal stereotypes respectively.