Irrelevant Alternatives Bias Large Language Model Hiring Decisions
Kremena Valkanova, Pencho Yordanov
TL;DR
This work investigates whether large language models exhibit the attraction effect, also known as the decoy effect, in AI-assisted hiring. Using a minimal, classical two-attribute design across six occupations, the authors prompt GPT-3.5 and GPT-4 in a recruiter role and compare how often the target is chosen over the competitor with and without a decoy. They find consistent evidence of the attraction effect: adding the decoy increases the target's selection probability. Irrelevant attributes such as gender amplify the bias, especially for GPT-4, which also exhibits greater variance. Robustness checks with warnings and varied recruiter roles do not systematically remove the bias, which highlights the need for careful mitigation when deploying LLMs in high-stakes recruitment tasks and raises ethical concerns about context effects in AI-assisted decision making.
Abstract
We investigate whether LLMs display a well-known human cognitive bias, the attraction effect, in hiring decisions. The attraction effect occurs when the presence of an inferior candidate makes a superior candidate more appealing, increasing the likelihood of the superior candidate being chosen over a non-dominated competitor. Our study finds consistent and significant evidence of the attraction effect in GPT-3.5 and GPT-4 when they assume the role of a recruiter. Irrelevant attributes of the decoy, such as its gender, further amplify the observed bias. GPT-4 exhibits greater bias variation than GPT-3.5. Our findings remain robust even when warnings against the decoy effect are included and the recruiter role definition is varied.
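To make the definition above concrete, the following is a minimal sketch of how a decoy can be characterized in a two-attribute setting and how an attraction effect could be quantified as a difference in target share. It is an illustration under stated assumptions, not the procedure used in the study: the names `Candidate`, `dominates`, `is_valid_decoy`, `target_share`, and `attraction_effect` are hypothetical, both attributes are assumed to be "higher is better", and the choice lists in the usage note are made-up toy data rather than reported results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    # Two hypothetical attributes, both assumed to be "higher is better".
    attr_a: float
    attr_b: float


def dominates(x: Candidate, y: Candidate) -> bool:
    """True if x is at least as good as y on both attributes and strictly better on one."""
    at_least_as_good = x.attr_a >= y.attr_a and x.attr_b >= y.attr_b
    strictly_better = x.attr_a > y.attr_a or x.attr_b > y.attr_b
    return at_least_as_good and strictly_better


def is_valid_decoy(decoy: Candidate, target: Candidate, competitor: Candidate) -> bool:
    """The decoy is dominated by the target only; target and competitor do not dominate each other."""
    return (
        dominates(target, decoy)
        and not dominates(competitor, decoy)
        and not dominates(target, competitor)
        and not dominates(competitor, target)
    )


def target_share(choices: Sequence[str]) -> float:
    """Share of 'target' among choices that picked either the target or the competitor."""
    relevant = [c for c in choices if c in ("target", "competitor")]
    if not relevant:
        raise ValueError("no target or competitor choices to compare")
    return relevant.count("target") / len(relevant)


def attraction_effect(with_decoy: Sequence[str], without_decoy: Sequence[str]) -> float:
    """Positive values mean the target is chosen relatively more often when the decoy is present."""
    return target_share(with_decoy) - target_share(without_decoy)
```

As a usage example with toy values, `is_valid_decoy(Candidate(6, 4), Candidate(7, 5), Candidate(5, 7))` holds because the target beats the decoy on both attributes while the competitor does not. Restricting the share to target and competitor choices is one design option among several; it keeps the two conditions comparable when the decoy itself is occasionally selected.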
