Bias in Decision-Making for AI's Ethical Dilemmas: A Comparative Study of ChatGPT and Claude

Wentao Xu, Yile Yan, Yuqi Zhu

TL;DR

The paper investigates biases in LLM ethical decision-making by evaluating nine models across four dilemma types (protective vs. harmful) and single versus intersectional protected attributes, totaling 50,400 trials. Using a multi-metric framework (normalized frequency, preference priority, sensitivity, stability, and clustering) and a detailed experimental setup, the authors reveal systematic protected-attribute biases that vary by model type (open-source vs. closed-source) and dilemma context. Open-source models show stronger biases for several attributes and higher sensitivity in harmful scenarios, while closed-source models are more selective in protective contexts but favor different attributes in harmful ones; intersectional inputs further amplify biases. The findings argue for comprehensive, context-aware fairness evaluations and transparent auditing to guide responsible deployment of LLMs in ethically salient applications. The work provides a structured methodology and empirical evidence to support governance, bias mitigation, and ongoing auditing in AI decision-making systems.

Abstract

Recent advances in Large Language Models (LLMs) have enabled human-like responses across various tasks, raising questions about their ethical decision-making capabilities and potential biases. This study systematically evaluates how nine popular LLMs (both open-source and closed-source) respond to ethical dilemmas involving protected attributes. Across 50,400 trials spanning single and intersectional attribute combinations in four dilemma scenarios (protective vs. harmful), we assess models' ethical preferences, sensitivity, stability, and clustering patterns. Results reveal significant biases with respect to protected attributes in all models, with differing preferences depending on model type and dilemma context. Notably, open-source LLMs show stronger preferences for marginalized groups and greater sensitivity in harmful scenarios, while closed-source models are more selective in protective situations and tend to favor mainstream groups. We also find that ethical behavior varies across dilemma types: LLMs maintain consistent patterns in protective scenarios but respond with more diverse and cognitively demanding decisions in harmful ones. Furthermore, models display more pronounced ethical tendencies under intersectional conditions than in single-attribute settings, suggesting that complex inputs reveal deeper biases. These findings highlight the need for multi-dimensional, context-aware evaluation of LLMs' ethical behavior and offer a systematic approach to evaluating, understanding, and addressing fairness in LLM decision-making.

Paper Structure (22 sections, 11 equations, 12 figures, 2 tables)

Figures (12)

  • Figure 1: Single attribute frequency heat map
  • Figure 2: Intersectional attribute frequency heat map
  • Figure 4: Sensitivity for single scenarios
  • Figure 5: Sensitivity for intersectional scenarios
  • Figure 7: Standard deviation heat map for single scenarios
  • ...and 7 more figures