Investigating Gender Stereotypes in Large Language Models via Social Determinants of Health

Trung Hieu Ngo; Adrien Bazoge; Solen Quiniou; Pierre-Antoine Gourraud; Emmanuel Morin

TL;DR

The study finds that embedded stereotypes can be probed using SDoH input and that LLMs rely on these stereotypes to make gendered decisions. This suggests that evaluating interactions among SDoH factors could usefully complement existing approaches to assessing LLM performance and bias.

Abstract

Large Language Models (LLMs) excel in Natural Language Processing (NLP) tasks, but they often propagate biases embedded in their training data, which can have serious consequences in sensitive domains such as healthcare. While existing benchmarks evaluate biases related to individual social determinants of health (SDoH) such as gender or ethnicity, they often overlook interactions between these factors and lack context-specific assessments. This study investigates bias in LLMs by probing the relationships between gender and other SDoH in French patient records. Through a series of experiments, we found that embedded stereotypes can be probed using SDoH input and that LLMs rely on embedded stereotypes to make gendered decisions, suggesting that evaluating interactions among SDoH factors could usefully complement existing approaches to assessing LLM performance and bias.

Paper Structure (24 sections, 11 figures, 4 tables)

This paper contains 24 sections, 11 figures, and 4 tables.

Introduction
Methodology
Task Motivation and Definition
Data
Dataset
Gender Neutralization
Models
Model choice
Models prompting
Evaluation and Analysis Methods
Gender Bias Measurement
Association between Gendered Predictions and SDoH
Experiments and Results
Gender Stereotype Evaluation
Interpreting the prediction tendencies
...and 9 more sections

Figures (11)

Figure 1: A sample of input data override in a diagnosis task. Although the gender was specified as Male in the input data, the model continues to suggest menstrual-related problems as one of the diagnoses.
Figure 2: Average number of occurrences of each predicted class across 3 runs for each model. Error bars indicate standard deviation.
Figure 3: Modified RMSE scores for 9 models, denoting the deviation from the neutral value of 4. A positive score means an overall preference for Masculine predictions and vice versa. Larger absolute values show higher bias degrees.
Figure 4: Heatmap of associations between SDoH options and the Male (left) and Female (right) predictions for nine models. Odds ratio values are reported in the cells. Color intensity indicates probability. Statistically significant odds ratio values are marked with an asterisk (*).
Figure 5: Heatmap of associations between Profession groups and the Male (left) and Female (right) predictions for nine models. Odds ratio values are reported in the cells. Color intensity indicates probability. Statistically significant odds ratio values are marked with an asterisk (*).
...and 6 more figures
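
The captions for Figures 4 and 5 report odds ratios between SDoH options and gendered predictions. As a reading aid, the sketch below shows how an odds ratio can be computed from a 2x2 contingency table in Python. It is a minimal illustration under stated assumptions, not the authors' procedure: the function name `odds_ratio`, the cell layout, the Wald 95% confidence interval, and the 0.5 zero-cell correction are all choices made here, and this page does not say which significance test the paper uses.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio and Wald 95% confidence interval for a 2x2 table.

    Hypothetical cell layout (an assumption, not taken from the paper):
        a: SDoH option present, model predicted Male
        b: SDoH option present, model did not predict Male
        c: SDoH option absent,  model predicted Male
        d: SDoH option absent,  model did not predict Male
    """
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is zero,
    # so the ratio and its standard error stay finite.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))

    ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper


# Made-up counts for illustration only; these are not results from the paper.
ratio, lower, upper = odds_ratio(30, 10, 20, 40)
# Under this sketch, an association counts as significant at the 5% level
# when the interval excludes 1.
significant = lower > 1 or upper < 1
```

An odds ratio above 1 means the prediction is more likely when the option is present, and a value below 1 means it is less likely. With the made-up counts above, the ratio is (30 × 40) / (10 × 20) = 6.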
