Meta-Learning for Repeated Bayesian Persuasion

Ata Poyraz Turna; Asrin Efe Yorulmaz; Tamer Başar

Abstract

Classical Bayesian persuasion studies how a sender influences receivers through carefully designed signaling policies within a single strategic interaction. In many real-world environments, such interactions are repeated across multiple games, creating opportunities to exploit structural similarity across tasks. In this work, we introduce Meta-Persuasion algorithms, establishing the first line of theoretical results for both full-feedback and bandit-feedback settings in the Online Bayesian Persuasion (OBP) and Markov Persuasion Process (MPP) frameworks. We show that our proposed meta-persuasion algorithms achieve provably sharper regret rates under natural notions of task similarity, improving upon the best-known convergence rates for both OBP and MPP. At the same time, they recover the standard single-game guarantees when the sequence of games is picked arbitrarily. Finally, we complement our theoretical analysis with numerical experiments that highlight our regret improvements and the benefits of meta-learning in repeated persuasion environments.

Paper Structure (26 sections, 27 theorems, 137 equations, 4 figures, 8 algorithms)

Introduction
Related Work
Preliminaries
Online Bayesian Persuasion
Markov Persuasion Processes
Meta-Learning Across Repeated Games
The Meta-Learning for Online Bayesian Persuasion
Full Feedback Setting
Partial Feedback Setting
The Meta-Learning for Markov Persuasion Processes
Estimators and Confidence Bounds for Meta-Learning in Markov Persuasion Processes
Full Feedback Setting
Partial Feedback Setting
Numerical Results
Numerical Results for Online Bayesian Persuasion
...and 11 more sections

Key Result

Theorem 3.1

For any set $\mathcal{J} \subset \mathbb{R}^{K}$ and any point $\bar{z}$ in its convex hull $\bar{\mathcal{J}}$, there exist points $z^{1},\ldots,z^{n} \in \mathcal{J}$ with $n \le K+1$ such that $\bar{z} = \sum_{i=1}^{n} \lambda_{i}\, z^{i}$, where $\lambda_{i} \ge 0$ and $\sum_{i=1}^{n} \lambda_{i} = 1$.
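
Carathéodory's theorem has a constructive proof: whenever a convex combination uses more than $K+1$ points, those points are affinely dependent, and shifting the weights along the dependence drives at least one weight to zero without moving the combined point. The following is a minimal pure-Python sketch of that textbook reduction in exact rational arithmetic. The function names `null_vector` and `caratheodory_reduce` are hypothetical and introduced only for illustration; this is not the paper's Carathéodory oracle, whose details are not given here.

```python
from fractions import Fraction


def null_vector(rows):
    """Return a nonzero mu with rows @ mu = 0 (assumes more columns than rows)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0])
    pivots = []  # pivot column of each reduced row, in row order
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column, so it is a free column
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    # Set one free variable to 1 and solve for the pivot variables.
    free = next(c for c in range(n_cols) if c not in pivots)
    mu = [Fraction(0)] * n_cols
    mu[free] = Fraction(1)
    for row, c in zip(m, pivots):
        mu[c] = -row[free]
    return mu


def caratheodory_reduce(points, weights):
    """Rewrite sum_i weights[i] * points[i] using at most K + 1 of the points.

    Returns a dict {index into points: positive weight}; weights sum to 1.
    """
    K = len(points[0])
    lam = {i: Fraction(w) for i, w in enumerate(weights) if w > 0}
    while len(lam) > K + 1:
        idx = list(lam)
        # Affine dependence: sum_i mu_i * z^i = 0 and sum_i mu_i = 0.
        rows = [[points[i][k] for i in idx] for k in range(K)]
        rows.append([1] * len(idx))
        mu = null_vector(rows)
        # Largest step that keeps every weight nonnegative.
        t = min(lam[i] / m for i, m in zip(idx, mu) if m > 0)
        lam = {i: lam[i] - t * m for i, m in zip(idx, mu) if lam[i] - t * m > 0}
    return lam


# Usage: the centre of a square, written with five points in R^2 (K = 2).
points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
weights = [Fraction(1, 5)] * 5
reduced = caratheodory_reduce(points, weights)
```

Each pass removes at least one point, so the loop ends after at most $n - (K+1)$ iterations. Exact fractions are used here so that weights reach zero exactly; a floating-point variant would need a tolerance when dropping points.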

Figures (4)

Figure 1: Illustration of the Carathéodory oracle used in Algorithms \ref{alg:omd-tuning-fb} and \ref{alg:omd-tuning}.

Theorems & Definitions (46)

Theorem 3.1: Carathéodory's Theorem
Theorem 3.2
Theorem 3.3
Corollary 3.1: Corollary 5.1 in MetaLearningBandits
Lemma 4.1: Lemma 1, MarkovPersuasionScratch2025
Theorem 4.1
Theorem 4.2
Theorem 4.3
Lemma A.1
Lemma A.2
...and 36 more
