A database to support the evaluation of gender biases in GPT-4o output

Luise Mehner, Lena Alicija Philine Fiedler, Sabine Ammon, Dorothea Kolossa

TL;DR

This paper addresses the evaluation of gender biases in GPT-4o outputs by proposing a database construction approach, grounded in standpoint theory, that makes its normative assumptions explicit. It introduces a prompt-generation pipeline covering open-ended, explicit, and implicit bias assessments, and collects a large, reproducible dataset across pre-test and main-test phases. The main contributions are the normative framework, the multi-method bias evaluation prompts, and the publicly released GPT-4o bias dataset, which is intended to foster reproducibility and critical discourse. The approach aims to provide a more nuanced and accountable framework for LLM fairness research, one that foregrounds marginalized perspectives and situated knowledge in the evaluation process.

Abstract

The widespread application of Large Language Models (LLMs) involves ethical risks for users and societies. A prominent ethical risk of LLMs is the generation of unfair language output that reinforces or exacerbates harm for members of disadvantaged social groups through gender biases (Weidinger et al., 2022; Bender et al., 2021; Kotek et al., 2023). Hence, the evaluation of the fairness of LLM outputs with respect to such biases is a topic of rising interest. To advance research in this field, promote discourse on suitable normative bases and evaluation methodologies, and enhance the reproducibility of related studies, we propose a novel approach to database construction. This approach enables the assessment of gender-related biases in LLM-generated language beyond merely evaluating their degree of neutralization.

Paper Structure

This paper contains 11 sections and 3 tables.