Learning to Persuade on the Fly: Robustness Against Ignorance

You Zu, Krishnamurthy Iyer, Haifeng Xu

TL;DR

This work studies repeated Bayesian persuasion when the distribution of the payoff-relevant state is unknown to both the sender and the receivers. The authors introduce a robust persuasiveness criterion and an online signaling algorithm, $\mathfrak{Rai}$, that maintains an $\ell_1$-ball around the empirical distribution and chooses signaling rules that are persuasive for every distribution within the ball. They prove that, with high probability, the algorithm achieves $O(\sqrt{T\log T})$ regret against the optimal signaling mechanism with known distribution, and they establish an $\Omega(\sqrt{T})$ lower bound that matches up to logarithmic factors, thereby characterizing the value of knowing the state distribution in this setting. The analysis quantifies the cost of robust persuasion via the Gap metric and leverages concentration in Banach spaces to bound regret, offering insights for prior-independent mechanism design and robust online learning in information design.

Abstract

Motivated by information sharing in online platforms, we study repeated persuasion between a sender and a stream of receivers where at each time, the sender observes a payoff-relevant state drawn independently and identically from an unknown distribution, and shares state information with the receivers who each choose an action. The sender seeks to persuade the receivers into taking actions aligned with the sender's preference by selectively sharing state information. However, in contrast to the standard models, neither the sender nor the receivers know the distribution, and the sender has to persuade while learning the distribution on the fly. We study the sender's learning problem of making persuasive action recommendations to achieve low regret against the optimal persuasion mechanism with the knowledge of the distribution. To do this, we first propose and motivate a persuasiveness criterion for the unknown distribution setting that centers robustness as a requirement in the face of uncertainty. Our main result is an algorithm that, with high probability, is robustly-persuasive and achieves $O(\sqrt{T\log T})$ regret, where $T$ is the horizon length. Intuitively, at each time our algorithm maintains a set of candidate distributions, and chooses a signaling mechanism that is simultaneously persuasive for all of them. Core to our proof is a tight analysis about the cost of robust persuasion, which may be of independent interest. We further prove that this regret order is optimal (up to logarithmic terms) by showing that no algorithm can achieve regret better than $\Omega(\sqrt{T})$.
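The idea of being "simultaneously persuasive" for a set of candidate distributions can be illustrated with a small sketch. It assumes the candidate set is an $\ell_1$-ball around the empirical distribution, as the TL;DR describes, and it assumes the standard obedience form of persuasiveness: for every recommended action $a$ and alternative $a'$, $\sum_\omega \mu(\omega)\,\sigma(\omega,a)\,(u(\omega,a)-u(\omega,a')) \ge 0$. All function names, the two-state example, and the utilities below are hypothetical and not taken from the paper; the sketch only checks a given mechanism and does not optimize over mechanisms as the paper's algorithm would.

```python
from collections import Counter


def empirical_distribution(samples, states):
    """Empirical frequencies of observed states (uniform if nothing observed yet)."""
    if not samples:
        return {s: 1.0 / len(states) for s in states}
    counts = Counter(samples)
    return {s: counts[s] / len(samples) for s in states}


def l1_distance(p, q):
    """l1 distance between two distributions over the same states."""
    return sum(abs(p[s] - q[s]) for s in p)


def worst_case_value(c, center, eps):
    """Minimize sum_s c[s] * mu[s] over distributions mu with ||mu - center||_1 <= eps.

    Greedy solution: shift up to eps/2 of probability mass onto the state with
    the smallest coefficient, taking it from the states with the largest ones.
    """
    mu = dict(center)
    low = min(c, key=c.get)
    budget = min(eps / 2.0, 1.0 - mu[low])
    mu[low] += budget
    for s in sorted(c, key=c.get, reverse=True):
        if s == low:
            continue
        take = min(mu[s], budget)
        mu[s] -= take
        budget -= take
        if budget <= 0:
            break
    return sum(c[s] * mu[s] for s in c), mu


def is_robustly_persuasive(sigma, u, center, eps, tol=1e-12):
    """Check every obedience constraint at its worst case over the l1-ball."""
    states = list(center)
    actions = list(next(iter(u.values())))
    for a in actions:
        for b in actions:
            if a == b:
                continue
            c = {s: sigma[s][a] * (u[s][a] - u[s][b]) for s in states}
            value, _ = worst_case_value(c, center, eps)
            if value < -tol:
                return False
    return True


# Hypothetical two-state example: the receiver wants to match the state.
u = {
    "guilty": {"convict": 1.0, "acquit": 0.0},
    "innocent": {"convict": 0.0, "acquit": 1.0},
}
center = {"guilty": 0.3, "innocent": 0.7}

# Mechanism that is exactly persuasive at the center distribution.
tight = {
    "guilty": {"convict": 1.0, "acquit": 0.0},
    "innocent": {"convict": 3 / 7, "acquit": 4 / 7},
}
# More cautious mechanism that recommends "convict" less often.
cautious = {
    "guilty": {"convict": 1.0, "acquit": 0.0},
    "innocent": {"convict": 0.2, "acquit": 0.8},
}

print(is_robustly_persuasive(tight, u, center, eps=0.1))
print(is_robustly_persuasive(cautious, u, center, eps=0.1))
```

In this toy instance the mechanism that is tight at the center distribution fails the robust check for a ball of radius 0.1, while the more cautious mechanism passes. The sender utility given up by moving from the first to the second is one way to picture the cost of robust persuasion that the abstract refers to.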

Paper Structure

This paper contains 22 sections, 11 theorems, 63 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

For each $t \in [T]$, let $\epsilon_t = \min\left\{\sqrt{\frac{|\Omega|}{t}} \left(1 + \sqrt{\Phi \log T}\right), 2\right\}$ with $\Phi > 0$. Then the $\mathfrak{Rai}$ algorithm is $\beta$-robustly persuasive with [...]. In particular, for $\Phi > 20$, we have $\beta \leq T^{-0.5}$.
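As a quick illustration of how the radius $\epsilon_t$ in Theorem 1 behaves, the following minimal sketch evaluates the stated formula directly. The function name `ball_radius` and its parameter names are hypothetical; only the formula itself comes from the theorem statement. The cap at 2 is natural because the $\ell_1$ distance between any two probability distributions is at most 2, so a radius of 2 already covers the whole simplex.

```python
import math


def ball_radius(t, horizon, num_states, phi):
    """Radius eps_t = min( sqrt(|Omega| / t) * (1 + sqrt(phi * log T)), 2 )."""
    if t < 1 or horizon < 1:
        raise ValueError("t and horizon must be positive integers")
    if phi <= 0:
        raise ValueError("phi must be positive")
    raw = math.sqrt(num_states / t) * (1.0 + math.sqrt(phi * math.log(horizon)))
    return min(raw, 2.0)


# Radii at a few rounds for a hypothetical instance with 3 states and T = 10**6.
for t in (1, 100, 10_000, 1_000_000):
    print(t, ball_radius(t, horizon=10**6, num_states=3, phi=21))
```

Early rounds hit the cap, meaning every distribution is still a candidate, and the radius then shrinks at rate $1/\sqrt{t}$ as samples accumulate.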

Figures (2)

  • Figure 1: The persuasion instance $\mathcal{I}_0$.
  • Figure 2: The persuasion instance $\mathcal{I}_1$.

Theorems & Definitions (15)

  • Example 1: Content recommendations by online media platforms
  • Example 2: Recommendations on hiring platforms
  • Definition 1
  • Definition 2
  • Theorem 1
  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Theorem 2
  • Theorem 3
  • ...and 5 more