Incentive-Aware Recommender Systems in Two-Sided Markets

Xiaowu Dai, Wenlu Xu, Yuan Qi, Michael I. Jordan

TL;DR

This paper addresses incentive-compatible exploration in two-sided online marketplaces, where myopic users prefer to exploit current information rather than explore. It introduces two algorithms, ARP and MARP, that combine information design and randomized recommendations to induce exploration while respecting agents’ opportunity costs, achieving sublinear regret and ex-post fairness. ARP covers the case where opportunity costs are known to the platform, while MARP handles private, unknown costs. Both come with theoretical guarantees and empirical validation on synthetic and real data. The work has practical implications for platforms such as social networks and ride-hailing services, and points to extensions for contextual bandits and adaptive clinical trials. Code is available for replication.

Abstract

Online platforms in the Internet Economy commonly incorporate recommender systems that recommend products (or "arms") to users (or "agents"). A key challenge in this domain arises from myopic agents who are naturally incentivized to exploit by choosing the optimal arm based on current information, rather than exploring various alternatives to gather information that benefits the collective. We propose a novel recommender system that aligns with agents' incentives while achieving asymptotically optimal performance, as measured by regret in repeated interactions. Our framework models this incentive-aware system as a multi-agent bandit problem in two-sided markets, where the interactions of agents and arms are facilitated by recommender systems on online platforms. This model incorporates incentive constraints induced by agents' opportunity costs. In scenarios where opportunity costs are known to the platform, we show the existence of an incentive-compatible recommendation algorithm. This algorithm pools recommendations between a genuinely good arm and an unknown arm using a randomized and adaptive strategy. Moreover, when these opportunity costs are unknown, we introduce an algorithm that randomly pools recommendations across all arms, utilizing the cumulative loss from each arm as feedback for strategic exploration. We demonstrate that both algorithms satisfy an ex-post fairness criterion, which protects agents from over-exploitation. All code for using the proposed algorithms and reproducing results is made available on GitHub.
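
To make the pooling idea in the abstract concrete, here is a minimal sketch of a randomized recommendation rule on a toy Bernoulli bandit. It is not the paper's ARP or MARP: the fixed `explore_prob`, the function name `pooled_recommendation`, and the choice of the least-sampled arm as the "unknown" arm are all assumptions made for illustration, and the sketch does not model opportunity costs or enforce any incentive constraint. It only shows how mixing an exploit recommendation with an explore recommendation affects regret.

```python
import random


def pooled_recommendation(means, horizon, explore_prob, seed=0):
    """Toy simulation (hypothetical, not the paper's algorithm).

    Each round, recommend the least-sampled arm with probability
    `explore_prob`, otherwise the arm with the best empirical mean.
    Returns the cumulative expected regret and the pull counts.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    pulls = [0] * n_arms
    totals = [0.0] * n_arms
    best_mean = max(means)
    regret = 0.0

    def empirical_mean(i):
        # Arms that were never pulled are scored 0.0 (an assumption).
        return totals[i] / pulls[i] if pulls[i] else 0.0

    for _ in range(horizon):
        exploit_arm = max(range(n_arms), key=empirical_mean)
        explore_arm = min(range(n_arms), key=lambda i: pulls[i])
        arm = explore_arm if rng.random() < explore_prob else exploit_arm

        # Bernoulli reward drawn from the true (hidden) mean of the arm.
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
        regret += best_mean - means[arm]

    return regret, pulls


if __name__ == "__main__":
    means = [0.3, 0.5, 0.7]
    for p in (0.0, 0.1):
        regret, pulls = pooled_recommendation(means, 500, p)
        print(f"explore_prob={p}: regret={regret:.1f}, pulls={pulls}")
```

With `explore_prob=0.0` the rule never leaves the first arm, which mirrors the pure-exploitation problem the abstract describes. Any positive exploration probability eventually samples every arm. The paper's algorithms instead adapt the exploration rate over time, which this fixed-rate sketch does not attempt.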

Paper Structure

This paper contains 43 sections, 7 theorems, 37 equations, 7 figures, 4 tables, 2 algorithms.

Key Result

Theorem 1

Suppose that the assumption in Eq. \ref{eqn:positiveplat} holds and parameters $\lambda, \theta_\tau, k$ are chosen according to Eqs. \ref{eqn:choiceoflambda}, \ref{eqn:thetatau}, and \ref{eqn:choiceofk}, respectively. Then ARP in Algorithm \ref{alg:adaptivesampling} guarantees the agent's incentive in Eq. \ref{eqn:incentivesct} for an

Figures (7)

  • Figure 1: The recommendation process mediated by the designer for online marketplaces.
  • Figure 2: Illustrating paths of $p_{i,t}$ for first-best and second-best policies.
  • Figure 3: (Second-Best) The exploration rate $p_{i,t}$ in Eq. \ref{eqn:choiceofLi} for arm $i>1$.
  • Figure 4: The mean regret of ARP and alternative algorithms for Section \ref{sec:bernoullibandit}, based on 500 data replications. The three plots correspond to $c_*=0.05$, $c_*=0.10$, and $c_*=0.15$, respectively.
  • Figure 5: The mean regret of MARP and alternative algorithms for Section \ref{sec:bernoullibandit}, based on 500 data replications. The three plots correspond to $c_t\sim \mathrm{Beta}(0.9, 0.9)$, $c_t\sim \mathrm{Beta}(1.1, 1.0)$, and $c_t\sim \mathrm{Beta}(1.0, 1.1)$, respectively.
  • ...and 2 more figures

Theorems & Definitions (7)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7