Learning-Based Robust Bayesian Persuasion with Conformal Prediction Guarantees
Heeseung Bang, Andreas A. Malikopoulos
TL;DR
This work addresses robust Bayesian persuasion when receivers form beliefs through private or non-Bayesian processes. It combines neural networks with conformal prediction to obtain finite-sample, distribution-free guarantees. The core idea is to learn an end-to-end mapping from receiver observations, sender signals, and signaling policies to action distributions, while enclosing possible receiver actions in conformal prediction sets $C_{1-\alpha}(Y,S,\pi)$. The paper provides exact coverage guarantees for the data-generating policy and derives bounds on coverage degradation under policy shifts, along with neural-network approximation and estimation error bounds and a finite-sample lower bound on sender utility. Numerical experiments in smart-grid energy management demonstrate robustness to private information and behavioral heterogeneity, with practical guidance for policy transfer and calibration in real systems.
Abstract
Classical Bayesian persuasion assumes that senders fully understand how receivers form beliefs and make decisions, an assumption that rarely holds when receivers possess private information or exhibit non-Bayesian behavior. In this paper, we develop a learning-based framework that integrates neural networks with conformal prediction to achieve robust persuasion under uncertainty about receiver belief formation. The proposed neural architecture learns end-to-end mappings from receiver observations and sender signals to action predictions, eliminating the need to identify belief mechanisms explicitly. Conformal prediction constructs finite-sample valid prediction sets with provable marginal coverage, enabling principled, distribution-free robust optimization. We establish exact coverage guarantees for the data-generating policy and derive bounds on coverage degradation under policy shifts. Furthermore, we provide neural network approximation and estimation error bounds with sample complexity $O(d \log(|\mathcal{U}||\mathcal{Y}||\mathcal{S}|)/\varepsilon^2)$, where $d$ denotes the effective network dimension, as well as finite-sample lower bounds on the sender's expected utility. Numerical experiments on smart-grid energy management illustrate the framework's robustness.
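To make the role of the conformal prediction sets concrete, the following is a minimal sketch of generic split conformal prediction over a finite action set. It is not the paper's implementation. The function names `conformal_quantile` and `prediction_set`, the nonconformity score $1 - \hat{p}(a)$, and the simulated softmax outputs standing in for a trained network are all assumptions made for illustration.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic
    if k > n:
        return np.inf  # too few calibration points: the set must hold every action
    return np.sort(scores)[k - 1]

def prediction_set(probs, qhat):
    """Indices of actions whose score 1 - p(a) does not exceed qhat."""
    return np.flatnonzero(1.0 - probs <= qhat)

rng = np.random.default_rng(0)
n_cal, n_test, n_actions, alpha = 500, 2000, 4, 0.1

def simulate(n):
    # Hypothetical stand-in for a trained network: random softmax outputs,
    # with receiver actions sampled from those same probabilities.
    logits = 2.0 * rng.normal(size=(n, n_actions))
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    actions = np.array([rng.choice(n_actions, p=p) for p in probs])
    return probs, actions

cal_probs, cal_actions = simulate(n_cal)
test_probs, test_actions = simulate(n_test)

# Calibration: score of the action the receiver actually took.
cal_scores = 1.0 - cal_probs[np.arange(n_cal), cal_actions]
qhat = conformal_quantile(cal_scores, alpha)

# Empirical coverage on fresh data drawn under the same policy.
covered = [a in prediction_set(p, qhat) for p, a in zip(test_probs, test_actions)]
coverage = float(np.mean(covered))
print(f"qhat = {qhat:.3f}, empirical coverage = {coverage:.3f}")
```

Under exchangeability of calibration and test points, sets built this way contain the realized action with probability at least $1-\alpha$. This holds only when test data come from the same policy that generated the calibration data, which is why coverage under policy shift needs the separate degradation bounds mentioned in the abstract.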
