Optimal Selection Using Algorithmic Rankings with Side Information

Kate Donahue, Nicole Immorlica, Brendan Lucier

TL;DR

The paper studies a decision-maker who selects one candidate from a noisy ranking augmented with a binary free/busy signal, and shows that increasing ranking accuracy can harm social welfare in a superstar setting. It develops a formal model using Plackett–Luce/RUM rankings, a busy penalty, and a two-choice (top free vs. top busy) strategy framework whose search window shrinks as accuracy improves. The main contributions include precise conditions under which firms prefer free vs. busy candidates, an analysis of welfare implications for firms and candidates, and a beyond-the-superstar algorithm showing that the results extend to more general settings under common noise models. The work highlights non-monotone welfare effects of algorithmic accuracy and offers structural insights and algorithmic tools for when and how to deploy ranking-based decision systems in human-in-the-loop contexts.

Abstract

Motivated by online platforms such as job markets, we study an agent choosing from a list of candidates, each with a hidden quality that determines match value. The agent observes only a noisy ranking of the candidates plus a binary signal that indicates whether each candidate is "free" or "busy." Being busy is positively correlated with higher quality, but can also reduce value due to decreased availability. We study the agent's optimal selection problem in the presence of ranking noise and free-busy signals and ask how the accuracy of the ranking tool impacts outcomes. In a setting with one high-valued candidate and an arbitrary number of low-valued candidates, we show that increased accuracy of the ranking tool can result in reduced social welfare. This can occur for two reasons: agents may be more likely to make offers to busy candidates, and (paradoxically) may be more likely to select lower-ranked candidates when rankings are more indicative of quality. We further discuss conditions under which these results extend to more general settings.

Paper Structure

This paper contains 23 sections, 45 theorems, 226 equations, 14 figures, 2 algorithms.

Key Result

Lemma 1

In the superstar setting, given a realized status vector $s$, it is always optimal to either pick the first (top-ranked) free item or the first busy item.
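Lemma 1 narrows the firm's decision to two candidates per realized status vector. A minimal sketch of how those two candidates could be extracted is shown below; the function name `first_free_and_busy` and the list-based encoding of the ranking and of $s$ are illustrative assumptions, not notation from the paper, and the sketch does not decide which of the two the firm should pick.

```python
from typing import Optional, Sequence, Tuple


def first_free_and_busy(
    ranking: Sequence[int], busy: Sequence[bool]
) -> Tuple[Optional[int], Optional[int]]:
    """Return the top-ranked free candidate and the top-ranked busy candidate.

    ranking: candidate ids in ranked order, best first (hypothetical encoding).
    busy:    busy[c] is True if candidate c is busy (the status vector s).
    Either entry of the result is None if no candidate has that status.
    """
    first_free = next((c for c in ranking if not busy[c]), None)
    first_busy = next((c for c in ranking if busy[c]), None)
    return first_free, first_busy


# Example: candidate 2 is ranked first and is busy; candidate 1 is the only free one.
pair = first_free_and_busy(ranking=[2, 0, 1], busy=[True, False, True])
```

By the lemma, in the superstar setting the firm only needs to compare the expected value of these two candidates.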

Figures (14)

  • Figure 1: Illustration of the firm's strategy in Algorithm \ref{algo:superstar}, given $N=5$ candidates with $v_1 > v_2=0$ and permutations given by Plackett-Luce. The x-axis varies the busy penalty $\gamma$, while the y-axis varies accuracy as parameterized by the $1/\beta$ Gumbel noise parameter (higher values increase accuracy). Shades of red indicate regions where first-free selection is the optimal strategy and shades of blue indicate where first-busy selection is (darker shades indicate regions where the firm has larger $j^*$ or $k^*$ and thus "hunts" further down the list to find a candidate with its preferred free/busy status). The purple region is where the optimal strategy is "follow the ranking". Note that the choice of first-free or first-busy depends only on $\gamma$, while the window size depends on the accuracy of the ranking tool.
  • Figure 2: Figures showing how changing the accuracy of the tool changes how frequently the picked candidate is busy. The top figure shows a firm using a first-free strategy: note that increasing accuracy uniformly increases the probability that the picked candidate is busy. The bottom figure shows a firm using a first-busy strategy: note that increasing accuracy could increase or decrease the probability that the picked candidate is busy. Parameters: $v_1 = 1, v_2=0, p_1 =0.1, p_2 = 0.4$, each point run with $10^6$ simulations. First-free strategy run with $\gamma = 10$, first-busy run with $\gamma = 1.6$.
  • Figure 3: Figure showing examples where increasing accuracy could lead to increased probability of picking the top-ranked candidate for first-free strategies (top figure) or decreased probability for first-busy strategies (bottom figure). In both cases, the phenomenon occurs at very low accuracy (far left part of the plot). Both figures show $N=10$, with lines shown only for $i\in [1, 4]$ to show detail, using RUM with Gumbel noise. Top has $p_1 = 0.01< p_2 = 0.05$, bottom has $p_1 = 0.9, p_2 = 0.95$. Numerical simulations with $10^6$ simulations per point.
  • Figure 4: Simulation of performance of strategies with $N=4$. Each line gives the expected utility of different strategies, where "best" gives the Bayes-optimal posterior strategy and Alg gives the performance of Algorithm \ref{algo:approximatebeyondsuperstar} (which reduces to Algorithm \ref{algo:superstar} in the superstar setting). The shaded regions illustrate where different named strategies happen to also be optimal: for example, the purple region is one where "follow the ranking" is exactly optimal. Unshaded regions are those where the optimal strategy cannot be described as $k$-free or $k$-busy. However, note that the error (gap between yellow and green lines) is often extremely small. The top and middle are both superstar settings, while the bottom is beyond-the-superstar. Note that the top has 0 error always because $v_2 = 0$ (Theorem \ref{thrm:vposntwoaccstrat}), while the middle has nonzero error. The bottom (beyond-the-superstar) has regions with non-zero error, but largely follows the optimal strategy.
  • Figure 5: Simulation with $T=2000$ time steps, averaged over 1000 simulations, showing the average probability that the high value and low value candidates are free.
  • ...and 9 more figures
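
Several captions above refer to rankings drawn from a RUM with Gumbel noise, with accuracy parameterized by $1/\beta$. The sketch below illustrates one way such a ranking could be simulated, assuming each candidate's score is $v_i + \beta G_i$ with $G_i$ i.i.d. standard Gumbel and candidates sorted by descending score. The function names, the seed, and the choice $v_1 = 1$ are assumptions for illustration; this is not the authors' simulation code.

```python
import math
import random
from typing import List, Sequence


def sample_ranking(values: Sequence[float], beta: float, rng: random.Random) -> List[int]:
    """Sample a noisy ranking: sort candidates by value plus scaled Gumbel noise."""
    scored = []
    for i, v in enumerate(values):
        # Clamp the uniform draw away from 0 and 1 so both logs are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # standard Gumbel sample
        scored.append((v + beta * gumbel, i))
    scored.sort(reverse=True)  # highest noisy score is ranked first
    return [i for _, i in scored]


def top_rank_frequency(values: Sequence[float], beta: float,
                       trials: int = 20000, seed: int = 0) -> float:
    """Estimate how often candidate 0 (the superstar) is ranked first."""
    rng = random.Random(seed)
    hits = sum(sample_ranking(values, beta, rng)[0] == 0 for _ in range(trials))
    return hits / trials


# Superstar setting as in Figure 1: N = 5, one high value, the rest zero.
values = [1.0, 0.0, 0.0, 0.0, 0.0]
accurate = top_rank_frequency(values, beta=0.2)  # 1/beta = 5, higher accuracy
noisy = top_rank_frequency(values, beta=5.0)     # 1/beta = 0.2, lower accuracy
```

Under this noise model the superstar should be ranked first more often as $1/\beta$ grows, which is the sense in which higher $1/\beta$ means a more accurate ranking tool.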

Theorems & Definitions (71)

  • Lemma 1
  • Lemma 2
  • Lemma 3
  • proof : Proof sketch
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • ...and 61 more