Human-AI Collaboration with Misaligned Preferences

Jiaxin Song, Parnian Shahkar, Kate Donahue, Bhaskar Ray Chaudhury

TL;DR

This work models human-AI collaboration as a curator algorithm selecting a top-$k$ subset from $m$ items, with humans who have noisy and potentially misaligned preferences. It analyzes when misalignment with the algorithm can improve individual and population welfare using Mallows and Plackett-Luce permutation models, and provides both theoretical results and empirical validation. A key finding is that misalignment can be beneficial for some users and settings. Utilitarian welfare maximization is NP-hard in general but can be solved in practice via a mixed-integer program, and uplift is achievable under certain conditions and with small amounts of noise. The study highlights a practical tension between maximizing overall welfare and ensuring uplift for all user types, and discusses design and policy implications, including when to allow, or even encourage, stochasticity and misalignment in algorithmic tools. These insights inform how to design algorithmic curation tools and policies that balance personalization with fairness across heterogeneous user populations.

Abstract

In many real-life settings, algorithms play the role of assistants, while humans ultimately make the final decision. Often, algorithms specifically act as curators, narrowing down a wide range of options into a smaller subset that the human picks between: consider content recommendation or chatbot responses to questions with multiple valid answers. Crucially, humans may not know their own preferences perfectly either, but instead may only have access to a noisy sampling over preferences. Algorithms can assist humans by curating a smaller subset of items, but must also face the challenge of misalignment: humans may have different preferences from each other (and from the algorithm), and the algorithm may not know the exact preferences of the human they are facing at any point in time. In this paper, we model and theoretically study such a setting. Specifically, we show instances where humans benefit by collaborating with a misaligned algorithm. Surprisingly, we show that humans gain more utility from a misaligned algorithm (which makes different mistakes) than from an aligned algorithm. Next, we build on this result by studying what properties of algorithms maximize human welfare when the goals could be either utilitarian welfare or ensuring all humans benefit. We conclude by discussing implications for designers of algorithmic tools and policymakers.
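The setting described above can be made concrete with a small exact computation. The sketch below is illustrative only and is not taken from the paper: it assumes a Mallows model of the form $\mathbb{P}[\pi] \propto \exp(-\beta \cdot d(\pi, \pi_0))$ with Kendall tau distance $d$, assumes that only the human's true top item has value (as in the Figure 1 example), and uses hypothetical function names (`mallows`, `p_best_alone`, `p_best_with_curator`) and parameter values. The algorithm shortlists the top $k$ items of its own noisy ranking, and the human picks the shortlisted item that ranks highest in their own noisy ranking.

```python
import itertools
import math


def kendall_tau(p, q):
    """Number of item pairs that rankings p and q order differently."""
    pos_q = {item: rank for rank, item in enumerate(q)}
    return sum(
        1
        for a, b in itertools.combinations(p, 2)  # a precedes b in p
        if pos_q[a] > pos_q[b]
    )


def mallows(center, beta):
    """Exact Mallows distribution (assumed form): P(pi) ~ exp(-beta * d(pi, center))."""
    perms = list(itertools.permutations(center))
    weights = [math.exp(-beta * kendall_tau(p, center)) for p in perms]
    total = sum(weights)
    return {p: w / total for p, w in zip(perms, weights)}


def p_best_alone(human_center, beta_h):
    """Probability that a noisy human, choosing alone, picks their true top item."""
    best = human_center[0]
    dist = mallows(human_center, beta_h)
    return sum(pr for p, pr in dist.items() if p[0] == best)


def p_best_with_curator(human_center, algo_center, beta_h, beta_a, k):
    """Probability of picking the human's true top item after top-k curation."""
    best = human_center[0]
    human = mallows(human_center, beta_h)
    algo = mallows(algo_center, beta_a)
    total = 0.0
    for a_perm, pa in algo.items():
        shortlist = set(a_perm[:k])  # algorithm keeps its k highest-ranked items
        if best not in shortlist:
            continue  # the best item was filtered out, so the human cannot pick it
        for h_perm, ph in human.items():
            # Human picks the shortlisted item ranked highest in their noisy ranking.
            pick = next(x for x in h_perm if x in shortlist)
            if pick == best:
                total += pa * ph
    return total


if __name__ == "__main__":
    truth = (1, 2, 3, 4)       # human's ground-truth ranking (hypothetical)
    misaligned = (1, 3, 2, 4)  # algorithm ranking that disagrees on items 2 and 3
    print("alone:     ", p_best_alone(truth, 1.0))
    print("aligned:   ", p_best_with_curator(truth, truth, 1.0, 1.0, k=2))
    print("misaligned:", p_best_with_curator(truth, misaligned, 1.0, 1.0, k=2))
```

Enumerating all $m!$ permutations keeps the computation exact but only scales to a handful of items; for larger $m$ one would sample rankings instead. Which algorithm helps more depends on the chosen rankings, noise levels, and $k$, so the printed numbers should be read as one instance rather than a general claim.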

Paper Structure

This paper contains 61 sections, 35 theorems, 55 equations, 10 figures, 3 tables, 1 algorithm.

Key Result

Lemma 1

For any item that is not $i$, the probability of picking that item is higher with the misaligned algorithm (and the probability of picking item $i$ is lower). That is, for any $r\in [m]$ with $r\neq i$, $\mathbb{P}[x_C^1 = x_r] \le \mathbb{P}[x_C^2 = x_r]$.

Figures (10)

  • Figure 1: Human-AI collaboration between Google Maps (Algorithm) and Alice (Human). Here, we assume that the algorithm is deterministic, but the human is noisy and only the best item ($x_1$) has value. If Alice picked by herself, she would pick $x_1$ 90% of the time, but when the algorithm deterministically reduces her set to $\{x_1, x_3\}$, she picks $x_1$ always.
  • Figure 2: Illustration of \ref{thm:benefit_inv1} with $3$ items, where the rounded node with number $i$ represents item $x_i$ and the most valuable item is in blue.
  • Figure 3: Comparison between human working with $A_1$, $A_2$, or alone, where $A_1$ & $A_2 > H$ denotes that the human performs better when assisted by either algorithm.
  • Figure 4: Utilitarian welfare and uplift achieved by the majority algorithm $A_m$, the welfare-maximizing algorithm $A_w$, and the uplift-maximizing algorithm $A_u$ across varying levels of human accuracy.
  • Figure 5: Comparison of human's expected utility differences after collaboration with a misaligned and an aligned algorithm, as a function of $\beta$. Each curve represents an algorithm with the corresponding ground-truth ranking.
  • ...and 5 more figures

Theorems & Definitions (62)

  • Definition 1: Inversion-monotonicity and label-invariance
  • Definition 2: Expected utilitarian social welfare
  • Definition 3: Uplift
  • Lemma 1
  • Theorem 1
  • Example 1
  • Theorem 2: Best/worst strategy for top item recovery
  • Example 2
  • Lemma 2
  • Lemma 3
  • ...and 52 more