Incentivizing Exploration with Selective Data Disclosure

Nicole Immorlica, Jieming Mao, Aleksandrs Slivkins, Zhiwei Steven Wu

TL;DR

This is the first paper in the literature on incentivized exploration (and possibly in the broader literature on "learning and incentives") that attempts to mitigate the limitations of standard economic assumptions.

Abstract

We propose and design recommendation systems that incentivize efficient exploration. Agents arrive sequentially, choose actions, and receive rewards drawn from fixed but unknown action-specific distributions. The recommendation system presents each agent with actions and rewards from a subsequence of past agents, chosen ex ante. Thus, the agents engage in sequential social learning, moderated by these subsequences. We asymptotically attain the optimal regret rate for exploration, using a flexible frequentist behavioral model and mitigating the rationality and commitment assumptions inherent in prior work. We suggest three components of effective recommendation systems: independent focus groups, group aggregators, and interlaced information structures.
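The setting in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the paper's model or policy: the function name `simulate`, the Bernoulli rewards, the greedy empirical-mean behavior, the default estimate of 0.5 for arms with no shown data, and the lowest-index tie-break are all hypothetical choices made for illustration. The one feature taken from the abstract is that each agent sees only a subsequence of past agents, fixed ex ante (here, a function of the agent's index alone).

```python
import random

def simulate(num_agents, means, visible, seed=0):
    """Toy sequential social learning with Bernoulli rewards (illustrative only).

    means   : true mean reward of each action (unknown to the agents)
    visible : function t -> iterable of past agent indices shown to agent t;
              it depends only on t, so the subsequence is chosen ex ante
    Returns the full history as a list of (action, reward) pairs.
    """
    rng = random.Random(seed)
    k = len(means)
    history = []
    for t in range(num_agents):
        # Only data from the pre-chosen subsequence of earlier agents is shown.
        shown = [history[s] for s in visible(t) if 0 <= s < t]
        counts = [0] * k
        sums = [0.0] * k
        for action, reward in shown:
            counts[action] += 1
            sums[action] += reward
        # Assumed behavior: pick the highest empirical mean;
        # arms with no shown data get a default estimate of 0.5.
        est = [sums[a] / counts[a] if counts[a] else 0.5 for a in range(k)]
        choice = max(range(k), key=lambda a: est[a])  # ties: lowest index
        reward = 1 if rng.random() < means[choice] else 0
        history.append((choice, reward))
    return history

if __name__ == "__main__":
    # Full disclosure: agent t sees every earlier agent.
    full = simulate(200, [0.3, 0.7], lambda t: range(t))
    print("pulls of arm 1 under full disclosure:", sum(a for a, _ in full))
```

Swapping in a different `visible` function changes which past agents each newcomer learns from, which is the design lever the abstract describes.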

Paper Structure

This paper contains 25 sections, 22 theorems, 52 equations, and 6 figures.

Key Result

Theorem 4.2

The two-level policy with parameter $T_1 = T^{2/3}\,(\log T)^{1/3}$ achieves regret
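To get a feel for the scale of the parameter $T_1 = T^{2/3}\,(\log T)^{1/3}$ named in Theorem 4.2, the short sketch below evaluates it for a given horizon $T$. The function name `two_level_t1` is hypothetical, and the use of the natural logarithm is an assumption (a different base would only change a constant factor). Whether and how the value is rounded to an integer is not stated here, so the sketch returns a float.

```python
import math

def two_level_t1(T):
    """Evaluate T_1 = T^(2/3) * (log T)^(1/3) for a horizon T > 1.

    Assumes the natural logarithm; returns an unrounded float.
    """
    if T <= 1:
        raise ValueError("T must exceed 1 so that log T is positive")
    return T ** (2.0 / 3.0) * math.log(T) ** (1.0 / 3.0)

if __name__ == "__main__":
    for horizon in (10**3, 10**6, 10**9):
        print(horizon, two_level_t1(horizon))
```

Since $T_1$ grows like $T^{2/3}$ up to a logarithmic factor, it becomes a vanishing fraction of the horizon as $T$ increases.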

Figures (6)

  • Figure 1: The information flow graph for a full disclosure policy.
  • Figure 2: Info-graph for the 2-level policy.
  • Figure 3: Info-graph for the three-level policy. Each red box in level 1 corresponds to $T_1$ full-disclosure paths of length $L^\mathrm{fd}_K$ each.
  • Figure 4: Connections between levels for the $L$-level policy, for $\sigma=2$.
  • Figure 5: PoIE as the path length grows: $N_{\mathtt{est}}=1$ (left) and $N_{\mathtt{est}}\in\{1,2,3,4\}$ (right).
  • ...and 1 more figure

Theorems & Definitions (47)

  • Example 3.1
  • Definition 4.1
  • Theorem 4.2
  • Remark 4.3
  • Remark 4.4
  • Remark 4.5
  • Example 4.6
  • Example 4.7
  • Theorem 5.2
  • Remark 5.3
  • ...and 37 more