Implicit Bias in LLMs for Transgender Populations

Micaela Hirsch; Marina Elichiry; Blas Radi; Tamara Quiroga; David Restrepo; Luciana Benotti; Veronica Xhardez; Jocelyn Dunstan; Enzo Ferrante

Abstract

Large language models (LLMs) have been shown to exhibit biases against LGBTQ+ populations. While safety training may lessen explicit expressions of bias, previous work has shown that implicit, stereotype-driven associations often persist. In this work, we examine implicit bias toward transgender people in two main scenarios. First, we adapt word association tests to measure whether LLMs disproportionately pair negative concepts with "transgender" and positive concepts with "cisgender". Second, acknowledging the well-documented systemic challenges that transgender people encounter in real-world healthcare settings, we examine implicit biases that may emerge when LLMs are applied to healthcare decision-making. To this end, we design a healthcare appointment allocation task in which models act as scheduling agents choosing between cisgender and transgender candidates across medical specialties prone to stereotyping. We evaluate seven LLMs in English and Spanish. Our results show consistent bias in categories such as appearance, risk, and veracity, indicating stronger negative associations with transgender individuals. In the allocation task, transgender candidates are favored for STI and mental health services, while cisgender candidates are preferred in gynecology and breast care. These findings underscore the need for research that addresses subtle, stereotype-driven biases in LLMs to ensure equitable treatment of transgender people in healthcare applications.

Paper Structure (15 sections, 1 equation, 6 figures, 3 tables)

Introduction
Surfacing implicit biases via word association tests
Implicit biases in health-related resource allocation
Conclusions
Appendix
Model versions
Prompts for word association
Prompts for association
Word list
Prejudice list
Prompts for resource allocation
Demographic information of the personas used for the allocation experiment
Model explanations in resource allocation
List of symptoms for the resource allocation experiment
Additional results for the resource allocation experiment (with and without symptoms)

Figures (6)

Figure 1: Bias score per category for seven different models (GPT 4o mini, GPT 4o, Gemini 2.0 Flash, Gemini 2.0 Flash Lite, Grok 3, Grok 3 mini and Llama 3 70B) in both English and Spanish. Error bars are the 95% bootstrapped confidence intervals ($B{=}2000$).
Figure 2: Selection rates for the cisgender and transgender patients for models GPT 4o mini, GPT 4o, Grok 3 mini, Grok 3, Gemini 2.0 Flash, Gemini 2.0 Flash Lite and Llama 3 70B, when only demographic information is provided.
Figure Appendix 1: Age distribution of cisgender and transgender profiles.
Figure Appendix 2: Selection rates for the cisgender and transgender patients for models GPT 4o mini, GPT 4o, Grok 3 mini, Grok 3, Gemini 2.0 Flash, Gemini 2.0 Flash Lite and Llama 3 70B in Spanish, when only demographic information is provided.
Figure Appendix 3: Selection rates for the cisgender and transgender patients for models GPT 4o mini, GPT 4o, Grok 3 mini, Grok 3, Gemini 2.0 Flash, Gemini 2.0 Flash Lite and Llama 3 70B when provided symptoms of similar urgency in Spanish.
...and 1 more figure
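
Figure 1 reports 95% bootstrapped confidence intervals with $B{=}2000$ resamples. As a rough illustration of how such an interval can be computed, the sketch below applies a percentile bootstrap to a small list of scores. The function name `bootstrap_ci`, the sample values, the use of the mean as the statistic, and the choice of the percentile method are all assumptions made for illustration; the listing above does not specify which bootstrap variant the authors used.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`.

    Hypothetical helper: resamples `values` with replacement `n_resamples`
    times and returns the (alpha/2, 1 - alpha/2) percentiles of the means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Draw n items with replacement and record the resample mean.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Made-up per-prompt scores in [-1, 1]; NOT data from the paper.
scores = [0.4, 0.1, -0.2, 0.6, 0.3, 0.5, 0.0, 0.2, 0.7, -0.1]
low, high = bootstrap_ci(scores)
print(f"mean={statistics.fmean(scores):.3f}, 95% CI=({low:.3f}, {high:.3f})")
```

With `n_resamples=2000` and `alpha=0.05`, the interval endpoints are taken from the sorted resample means near positions 50 and 1950, which is what a 95% percentile interval amounts to in practice.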
