When the Universe is Too Big: Bounding Consideration Probabilities for Plackett-Luce Rankings

Ben Aoki-Sherwood, Catherine Bregou, David Liben-Nowell, Kiran Tomlinson, Thomas Zeng

Abstract

The widely used Plackett-Luce ranking model assumes that individuals rank items by making repeated choices from a universe of items. But in many cases the universe is too big for people to plausibly consider all options. In the choice literature, this issue has been addressed by supposing that individuals first sample a small consideration set and then choose among the considered items. However, inferring unobserved consideration sets (or item consideration probabilities) in this "consider then choose" setting poses significant challenges, because even simple models of consideration with strong independence assumptions are not identifiable, even if item utilities are known. We apply the consider-then-choose framework to top-$k$ rankings, where we assume rankings are constructed according to a Plackett-Luce model after sampling a consideration set. While item consideration probabilities remain non-identified in this setting, we prove that we can infer bounds on the relative values of consideration probabilities. Additionally, given a condition on the expected consideration set size and known item utilities, we derive absolute upper and lower bounds on item consideration probabilities. We also provide algorithms to tighten those bounds on consideration probabilities by propagating inferred constraints. Thus, we show that we can learn useful information about consideration probabilities despite not being able to identify them precisely. We demonstrate our methods on a ranking dataset from a psychology experiment with two different ranking tasks (one with fixed consideration sets and one with unknown consideration sets). This combination of data allows us to estimate utilities and then learn about unknown consideration probabilities using our bounds.
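
As an illustration of the consider-then-choose process described above, the following is a minimal simulation sketch, not the authors' implementation. It assumes that each item is considered independently with its own probability, that choice probabilities are proportional to the exponentiated utility, and that a consideration set with fewer than $k$ items simply yields a shorter ranking; the paper may treat that last case differently. The function name `sample_top_k` and its arguments are hypothetical.

```python
import math
import random


def sample_top_k(utilities, consider_probs, k, rng):
    """Sample one top-k ranking under a consider-then-choose sketch.

    utilities:      dict item -> utility u_i (assumed known)
    consider_probs: dict item -> consideration probability p_i
    k:              number of positions to rank
    rng:            a random.Random instance
    """
    # Consideration step (assumption): items enter the set independently.
    considered = [i for i, p in consider_probs.items() if rng.random() < p]

    ranking = []
    # Choice step: Plackett-Luce, i.e. repeated choice without replacement,
    # with probability proportional to exp(u_i) among remaining items.
    while considered and len(ranking) < k:
        weights = [math.exp(utilities[i]) for i in considered]
        pick = rng.choices(considered, weights=weights, k=1)[0]
        ranking.append(pick)
        considered.remove(pick)
    # Assumption: if fewer than k items were considered, the ranking is short.
    return ranking


if __name__ == "__main__":
    rng = random.Random(0)
    u = {"a": 1.0, "b": 0.0, "c": -1.0}
    p = {"a": 0.3, "b": 0.9, "c": 0.9}
    print([sample_top_k(u, p, 2, rng) for _ in range(5)])
```

Only the rankings are observed in such a simulation, never the sampled consideration sets, which is the source of the inference difficulty discussed in the abstract.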

Paper Structure

This paper contains 15 sections, 27 theorems, 66 equations, 2 figures, 4 algorithms.

Key Result

Theorem 4.0

For all $n \ge 1$ and $1 \le k \le n$, consideration probabilities are not identifiable in the PL+C model. That is, there are multiple sets of consideration probabilities that generate the same distribution over rankings, with fixed utilities $u_i$ for each $i \in \mathcal{U}$.

Figures (2)

  • Figure 1: Left: the transitive reduction of the graph $G$ produced by Algorithm \ref{alg:lower-bounds} on the data from \cite{putnam2018collective}; right: feasible consideration probability intervals for the 50 U.S. states. On the left, an edge from state $i$ to state $j$ indicates that $i$'s inferred utility is larger than $j$'s but that $j$ appears in the top $\ell$ positions more often than $i$ (for some $\ell \in \{1, 2, 3\}$). Thus, we conclude from Theorem \ref{thm:consider-flip} that $p_i \le p_j$. Nodes are labeled with states' postal abbreviations. On the right, the light blue intervals show the bounds from Theorems \ref{thm:consideration-initial-LB} and \ref{thm:upper-bound:repeated-applications}, while the smaller black intervals show the bounds after tightening with Algorithms \ref{alg:lower-bounds} and \ref{alg:upper-bounds}. Red diamonds show state utilities learned by PL on the Random-10 data, scaled additively so the minimum utility is 0.
  • Figure 2: Versions of Figure \ref{fig:swaps-bounds} (right) with $\alpha$ varying from $2$ to $7$, showing how our bounds change as the lower bound on the average number of states considered grows from $\alpha k = 6$ to $\alpha k = 21$.
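
The edge rule stated in the Figure 1 caption can be sketched as follows. This is only an illustration of that rule on empirical counts, not the paper's algorithm: it ignores sampling noise, does not compute the transitive reduction, and does not propagate bounds. The function name `swap_edges` is hypothetical, and every ranked item is assumed to have an entry in `utilities`.

```python
from itertools import permutations


def swap_edges(rankings, utilities, k):
    """Return the set of edges (i, j), each read as the constraint p_i <= p_j.

    An edge (i, j) is added when u_i > u_j but j appears in the top l
    positions more often than i for some l in {1, ..., k}.
    """
    items = list(utilities)
    # counts[l][i] = number of rankings with item i among the first l positions
    counts = {l: {i: 0 for i in items} for l in range(1, k + 1)}
    for ranking in rankings:
        for pos, item in enumerate(ranking[:k], start=1):
            # An item at position pos is in the top l for every l >= pos.
            for l in range(pos, k + 1):
                counts[l][item] += 1

    edges = set()
    for i, j in permutations(items, 2):
        higher_utility = utilities[i] > utilities[j]
        ranked_less_often = any(
            counts[l][j] > counts[l][i] for l in range(1, k + 1)
        )
        if higher_utility and ranked_less_often:
            edges.add((i, j))
    return edges


if __name__ == "__main__":
    data = [["b", "a"], ["b", "a"], ["a", "b"]]
    print(swap_edges(data, {"a": 1.0, "b": 0.0}, k=2))
```

In the toy data, item `a` has the higher utility but `b` is ranked first more often, so the sketch records the constraint that `a` is considered no more often than `b`.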

Theorems & Definitions (41)

  • Theorem 4.0
  • Lemma 4.0
  • Theorem 4.1
  • Theorem 4.1
  • Lemma 4.1
  • Lemma 5.0
  • Theorem 5.0
  • Theorem 5.0
  • Lemma 6.0
  • Lemma 6.0
  • ...and 31 more