A Matter of Perspective(s): Contrasting Human and LLM Argumentation in Subjective Decision-Making on Subtle Sexism
Paula Akemi Aoyagui, Kelsey Stemmler, Sharon Ferguson, Young-ho Kim, Anastasia Kuzminykh
TL;DR
The study analyzes how humans and several LLMs differ in perspective-taking when making subjective decisions about subtle sexism, a task with no ground truth. Using a dataset of scenarios together with responses from GPT-3, GPT-3.5, GPT-4, Llama 3.1, and human respondents, it defines Stance (sexist, not sexist, depends, no stance) and Perspectives (victim, perpetrator, decision-maker) to quantify reasoning patterns. It finds that humans and all models draw on the same taxonomy, but at different frequencies and in different combinations; newer models produce richer, multi-perspective explanations and exhibit safety guardrails. The work argues for evaluating LLMs on perspective diversity and on how well they complement human judgment, with implications for bias measurement and human-AI collaboration in value-based decision-making.
Abstract
In subjective decision-making, where decisions are based on contextual interpretation, Large Language Models (LLMs) can be integrated to present users with additional rationales to consider. The diversity of these rationales is mediated by the ability to consider the perspectives of different social actors. However, it remains unclear whether and how models differ in the distribution of perspectives they provide. We compare the perspectives taken by humans and different LLMs when assessing subtle sexism scenarios. We show that these perspectives can be classified within a finite set (perpetrator, victim, decision-maker) that is consistently present in the argumentation produced by humans and LLMs, but in different distributions and combinations, revealing both similarities and differences between human and model responses, and among models. We argue for the need to systematically evaluate LLMs' perspective-taking to identify the most suitable models for a given decision-making task. We discuss the implications for model evaluation.
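To make the idea of "distributions and combinations" of perspectives concrete, the following is a minimal Python sketch of how annotated responses could be tallied per respondent group. It is an illustration only, not the authors' analysis pipeline: the function names, the data layout, and the example annotations (including the placeholder group "model_a") are hypothetical; only the Stance and Perspective labels come from the text above.

```python
from collections import Counter
from enum import Enum


class Stance(Enum):
    SEXIST = "sexist"
    NOT_SEXIST = "not sexist"
    DEPENDS = "depends"
    NO_STANCE = "no stance"


class Perspective(Enum):
    VICTIM = "victim"
    PERPETRATOR = "perpetrator"
    DECISION_MAKER = "decision-maker"


def perspective_distribution(responses):
    """Share of responses in which each perspective appears at least once."""
    total = len(responses)
    if total == 0:
        return {}
    counts = Counter()
    for _stance, perspectives in responses:
        counts.update(set(perspectives))
    return {p: counts[p] / total for p in Perspective}


def combination_counts(responses):
    """How often each exact combination of perspectives occurs."""
    return Counter(frozenset(perspectives) for _stance, perspectives in responses)


# Made-up annotations for illustration; these are NOT data from the study.
# Each response is a (stance, set of perspectives) pair.
example = {
    "human": [
        (Stance.SEXIST, {Perspective.VICTIM}),
        (Stance.DEPENDS, {Perspective.VICTIM, Perspective.PERPETRATOR}),
    ],
    "model_a": [  # hypothetical model name
        (Stance.DEPENDS, {Perspective.VICTIM, Perspective.PERPETRATOR,
                          Perspective.DECISION_MAKER}),
        (Stance.NO_STANCE, {Perspective.DECISION_MAKER}),
    ],
}

if __name__ == "__main__":
    for group, responses in example.items():
        print(group, perspective_distribution(responses))
        print(group, combination_counts(responses))
```

Comparing the per-group dictionaries side by side is one simple way to see that two groups can share the same label set while differing in how often, and in which combinations, each perspective is used.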
