Language Models Exhibit Inconsistent Biases Towards Algorithmic Agents and Human Experts

Jessica Y. Bo; Lillio Mok; Ashton Anderson

TL;DR

Results suggest that LLMs may encode inconsistent biases towards humans and algorithms, which need to be carefully considered when they are deployed in high-stakes scenarios. The sensitivity of LLMs to task presentation formats should also be broadly scrutinized when assessing evaluation robustness for AI safety.

Abstract

Large language models are increasingly used in decision-making tasks that require them to process information from a variety of sources, including both human experts and other algorithmic agents. How do LLMs weigh the information provided by these different sources? We consider the well-studied phenomenon of algorithm aversion, in which human decision-makers exhibit bias against predictions from algorithms. Drawing upon experimental paradigms from behavioural economics, we evaluate how eight different LLMs delegate decision-making tasks when the delegatee is framed as a human expert or an algorithmic agent. To be inclusive of different evaluation formats, we conduct our study with two task presentations: stated preferences, modeled through direct queries about trust towards either agent, and revealed preferences, modeled through providing in-context examples of the performance of both agents. When prompted to rate the trustworthiness of human experts and algorithms across diverse tasks, LLMs give higher ratings to the human expert, which correlates with prior results from human respondents. However, when shown the performance of a human expert and an algorithm and asked to place an incentivized bet between the two, LLMs disproportionately choose the algorithm, even when it performs demonstrably worse. These discrepant results suggest that LLMs may encode inconsistent biases towards humans and algorithms, which need to be carefully considered when they are deployed in high-stakes scenarios. Furthermore, we discuss the sensitivity of LLMs to task presentation formats, which should be broadly scrutinized when evaluating robustness for AI safety.
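
To make the two task presentations concrete, the following is a minimal sketch of how the two measures could be summarized. It is not the authors' analysis code. The function names, the record layout, the sign convention (human rating minus algorithm rating), and all numeric values are assumptions made purely for illustration; none of the numbers are results from the paper.

```python
from statistics import mean

# Hypothetical stated-preference records: one trust rating per agent per task.
# The rating scale and the values are illustrative assumptions, not paper data.
stated_ratings = [
    {"task": "task_a", "human": 7, "algorithm": 5},
    {"task": "task_b", "human": 6, "algorithm": 6},
    {"task": "task_c", "human": 8, "algorithm": 4},
]

# Hypothetical revealed-preference outcomes: which agent the model bet on
# after seeing in-context performance, or "neutral" for no stated preference.
revealed_bets = ["algorithm", "algorithm", "human", "neutral", "algorithm"]


def trust_gap(ratings):
    """Mean of (human rating - algorithm rating) across tasks.

    Under this assumed sign convention, a positive value means the human
    expert was rated as more trustworthy (stated algorithm aversion).
    """
    return mean(r["human"] - r["algorithm"] for r in ratings)


def choice_rates(bets):
    """Fraction of bets placed on each option."""
    n = len(bets)
    return {opt: bets.count(opt) / n for opt in ("human", "algorithm", "neutral")}


if __name__ == "__main__":
    print("stated trust gap:", trust_gap(stated_ratings))
    print("revealed choice rates:", choice_rates(revealed_bets))
```

A positive stated trust gap alongside a revealed choice rate that favors the algorithm would correspond to the kind of stated-revealed inconsistency the abstract describes.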

Paper Structure (19 sections, 11 figures, 4 tables)

This paper contains 19 sections, 11 figures, 4 tables.

Introduction
Related Work
Methods
Study 1: Asking Direct Queries (Stated)
Study 2: Providing In-Context Information (Revealed)
Results
Study 1: Direct Queries Invoke Algorithm Aversion (Stated)
Study 2: In-Context Information Invoke Algorithm Appreciation (Revealed)
Stated-Revealed Comparison (RQ3)
Updated Experiments with Newer LLMs
Discussion
Limitations
Conclusion
Study 1 Prompting
Study 2 Prompting
...and 4 more sections

Figures (11)

Figure 1: 'Stated' algorithm aversion across all models and tasks, operationalized as the gap between the trust rating given to the human expert and the algorithm.
Figure 2: Aggregate probabilities that LLMs in Study 2 demonstrate in delegating an algorithmic agent or a human expert to make the next prediction in the task, or neutral (no indicated preference).
Figure 3: Probability that each LLM correctly bets on the stronger predictor, disaggregated by task and whether the stronger predictor is presented as a human expert or an algorithm.
Figure 4: Probability of choosing the human expert over the algorithm in Studies 1 and 2, demonstrating the stated-revealed trust inconsistency. Error bars are SEM.
Figure D.1: Correlation between the LLMs' trust gaps.
...and 6 more figures
