Meta-Learning for Repeated Bayesian Persuasion

Ata Poyraz Turna, Asrin Efe Yorulmaz, Tamer Başar

Abstract

Classical Bayesian persuasion studies how a sender influences receivers through carefully designed signaling policies within a single strategic interaction. In many real-world environments, such interactions are repeated across multiple games, creating opportunities to exploit structural similarity across tasks. In this work, we introduce Meta-Persuasion algorithms, establishing the first line of theoretical results for both full-feedback and bandit-feedback settings in the Online Bayesian Persuasion (OBP) and Markov Persuasion Process (MPP) frameworks. We show that our proposed meta-persuasion algorithms achieve provably sharper regret rates under natural notions of task similarity, improving upon the best-known convergence rates for both OBP and MPP. At the same time, they recover the standard single-game guarantees when the sequence of games is picked arbitrarily. Finally, we complement our theoretical analysis with numerical experiments that highlight our regret improvements and the benefits of meta-learning in repeated persuasion environments.

Paper Structure (26 sections, 27 theorems, 137 equations, 4 figures, 8 algorithms)

This paper contains 26 sections, 27 theorems, 137 equations, 4 figures, 8 algorithms.

Key Result

Theorem 3.1

For any set $\mathcal{J} \subset \mathbb{R}^{K}$ and any point $\bar{z}$ in its convex hull $\bar{\mathcal{J}}$, there exist $n \le K+1$ points $z^{1},\ldots,z^{n} \in \mathcal{J}$ such that $\bar{z} = \sum_{i=1}^{n} \lambda_{i}\, z^{i}$, where $\lambda_{i} \ge 0$ and $\sum_{i=1}^{n} \lambda_{i} = 1$.
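To make the statement of Carathéodory's theorem concrete, the following is a minimal sketch of the standard constructive argument: starting from any convex combination of finitely many points in $\mathbb{R}^{K}$, repeatedly cancel an affine dependence among the active points until at most $K+1$ remain. The function name `caratheodory_reduce`, the tolerance `tol`, and the random test data are assumptions made for illustration; this is not the paper's Carathéodory oracle, only one textbook way the decomposition can be computed.

```python
import numpy as np

def caratheodory_reduce(points, weights, tol=1e-12):
    """Rewrite sum_i weights[i] * points[i] using at most K+1 of the points.

    points  : (m, K) array of candidate points
    weights : (m,) nonnegative weights summing to 1
    Returns (indices, lambdas) of the reduced convex combination.
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float).copy()
    dim = points.shape[1]
    support = np.flatnonzero(weights > tol)

    while support.size > dim + 1:
        # More than K+1 points are affinely dependent: find mu != 0 with
        # sum_i mu_i * p_i = 0 and sum_i mu_i = 0 (a null vector of A).
        A = np.vstack([points[support].T, np.ones(support.size)])
        mu = np.linalg.svd(A)[2][-1]

        # Largest step that keeps every weight nonnegative; it drives at
        # least one active weight to zero without moving the combination.
        pos = mu > tol
        step = np.min(weights[support][pos] / mu[pos])
        weights[support] = weights[support] - step * mu
        weights[weights <= tol] = 0.0
        support = np.flatnonzero(weights > tol)

    lambdas = weights[support] / weights[support].sum()
    return support, lambdas

# Illustrative data: 10 random points in R^2 with random convex weights.
rng = np.random.default_rng(0)
pts = rng.normal(size=(10, 2))
w = rng.dirichlet(np.ones(10))
idx, lam = caratheodory_reduce(pts, w)
```

Here `lam @ pts[idx]` reproduces the original point `w @ pts` up to floating-point error while using at most three of the ten points, which is the $K+1$ bound for $K=2$.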

Figures (4)

  • Figure 1: Illustration of the Carathéodory oracle used in Algorithms \ref{alg:omd-tuning-fb} and \ref{alg:omd-tuning}.

Theorems & Definitions (46)

  • Theorem 3.1: Carathéodory's Theorem
  • Theorem 3.2
  • Theorem 3.3
  • Corollary 3.1: Corollary 5.1 in MetaLearningBandits
  • Lemma 4.1: Lemma 1 in MarkovPersuasionScratch2025
  • Theorem 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Lemma A.1
  • Lemma A.2
  • ...and 36 more