Incentivized Exploration via Filtered Posterior Sampling

Anand Kalvit; Aleksandrs Slivkins; Yonatan Gur

TL;DR

The paper studies incentivized exploration (IE), in which a principal uses information signals to steer sequentially arriving agents toward exploratory actions. It proposes filtered posterior sampling, a semantics-consistent variant of Thompson Sampling, and proves that it is Bayesian incentive-compatible (BIC) under a spectral-diversity condition on the warm-start data. A single analysis covers private and public agent types and correlated priors, with corollaries for private types, informative recommendations, sleeping bandits, combinatorial semi-bandits, and linear bandits with public types. The paper also shows that other native algorithms (e.g., UCB, filtered least squares) admit similar BIC guarantees under analogous conditions. The guarantees are instance-dependent, and the results position posterior sampling as a general-purpose tool for IE in heterogeneous recommendation settings.

Abstract

We study "incentivized exploration" (IE) in social learning problems where the principal (a recommendation algorithm) can leverage information asymmetry to incentivize sequentially-arriving agents to take exploratory actions. We identify posterior sampling, an algorithmic approach that is well known in the multi-armed bandits literature, as a general-purpose solution for IE. In particular, we expand the existing scope of IE in several practically-relevant dimensions, from private agent types to informative recommendations to correlated Bayesian priors. We obtain a general analysis of posterior sampling in IE which allows us to subsume these extended settings as corollaries, while also recovering existing results as special cases.
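For readers unfamiliar with posterior sampling, the following is a minimal sketch of plain Thompson Sampling on a K-armed Bernoulli bandit with a warm-up phase. It is not the paper's filtered variant and carries no incentive-compatibility guarantee. The Beta(1, 1) priors, the fixed number of warm-up pulls per arm, and all names (`thompson_sampling`, `warmup_per_arm`, `true_means`) are assumptions made for illustration only.

```python
import random


def thompson_sampling(true_means, horizon, warmup_per_arm=5, seed=0):
    """Plain Thompson Sampling for Bernoulli arms with Beta(1, 1) priors.

    Illustrative only: this is standard posterior sampling, not the
    filtered variant analyzed in the paper.
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    total_reward = 0

    def pull(arm):
        # Draw a Bernoulli reward and update that arm's counts.
        nonlocal total_reward
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    # Warm-up phase (assumed form): sample every arm a fixed number of times.
    for arm in range(k):
        for _ in range(warmup_per_arm):
            pull(arm)

    # Main phase: draw one sample from each arm's Beta posterior and
    # play the arm whose sampled mean is largest.
    for _ in range(horizon):
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(k)
        ]
        best = max(range(k), key=lambda a: samples[a])
        pull(best)

    return pulls, total_reward


if __name__ == "__main__":
    counts, reward = thompson_sampling([0.2, 0.5, 0.8], horizon=500)
    print("pulls per arm:", counts)
    print("total reward:", reward)
```

In this sketch the warm-up data plays the role of the initial samples that any posterior-based rule starts from; the paper's result concerns how diverse such data must be for recommendations to remain incentive-compatible.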

Paper Structure (38 sections, 24 theorems, 63 equations, 2 figures)

Introduction
Related work
Model: Incentivized Exploration
Filtered posterior sampling: Main Results and Corollaries
Private agent types
Informative recommendations
Correlated Bayesian priors
Filtered posterior sampling: Additional Corollaries
Sleeping bandits
Combinatorial semi-bandits
Linear bandits with public types
Incentivized exploration via other native algorithms
Filtered least squares algorithm for linear bandits with private types
UCB and Frequentist-Greedy algorithms for K-armed bandits
Standard definitions, facts, and results used in analyses
...and 23 more sections

Key Result

Theorem 1

Assume that $\delta_{0}(\mathscr{Q})>0$, as per Eq. (eqn:primitives3). Fix $\varepsilon>0$ and suppose that the spectral diversity of the warm-up data satisfies $\lambda_{[T_0]}\gtrsim \Lambda(\varepsilon) := \left(D/\varepsilon^2\right)\log\left(2/\delta_{0}(\mathscr{Q})\,\dots\right)$. Then, filtered posterior sampling is $g(\varepsilon)$-BIC, with ...

Figures (2)

Figure 1: Protocol: Incentivized Exploration
Figure 2: "Semantics-consistent" messaging policy

Theorems & Definitions (43)

Remark 2.1
Remark 2.2
Definition 1
Remark 2.3
Definition 2: Menu-consistency
Remark 3.1
Remark 3.2
Theorem 1: General Guarantee
Remark 3.3
Corollary 1: private types
...and 33 more
