Incentivized Truthful Communication for Federated Bandits

Zhepei Wei, Chuanhao Li, Tianze Ren, Haifeng Xu, Hongning Wang

TL;DR

This work addresses incentivized communication in federated bandits by designing Truth-FedBan, a mechanism that enforces truthful reporting of participation costs via a monotone selection rule and a critical-value payment scheme. By reformulating client selection as a log-determinant submodular set cover problem, Truth-FedBan achieves a constant-factor bi-criteria approximation of the minimum social cost while preserving near-optimal regret and sublinear communication cost. Theoretical guarantees cover truthfulness, individual rationality, and near-optimal learning performance, and empirical results indicate robustness to misreporting and performance competitive with baselines. The approach makes incentive-compatible federated bandit learning practical with provable efficiency, and it suggests extensions to broader distributed learning settings and adversarial scenarios.
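To make the "log-determinant submodular set cover" framing concrete, here is a minimal sketch of a generic cost-aware greedy heuristic for that kind of problem. It is not the paper's Truthful Incentive Search; the function names, the use of positive semi-definite client update matrices, and the coverage threshold `target` are all assumptions made for illustration.

```python
import numpy as np

def logdet(V):
    """Log-determinant of a positive-definite matrix."""
    return np.linalg.slogdet(V)[1]

def greedy_cover(V_server, client_updates, costs, target):
    """Hypothetical greedy cover: repeatedly add the client with the best
    marginal log-det gain per unit cost until log det reaches `target`
    or no clients remain."""
    V = V_server.copy()
    chosen = []
    remaining = set(range(len(client_updates)))
    while remaining and logdet(V) < target:
        best = max(
            sorted(remaining),  # sorted for a deterministic scan order
            key=lambda i: (logdet(V + client_updates[i]) - logdet(V)) / costs[i],
        )
        V = V + client_updates[best]
        chosen.append(best)
        remaining.remove(best)
    return chosen, V

# Toy instance: three clients with assumed 2x2 update matrices and unit costs.
V0 = np.eye(2)
updates = [np.diag([3.0, 0.0]), np.diag([0.0, 2.0]), 0.1 * np.eye(2)]
chosen, V = greedy_cover(V0, updates, costs=[1.0, 1.0, 1.0], target=np.log(8.0))
```

On this toy instance the two informative clients are enough to reach the threshold, so the low-information third client is never selected.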

Abstract

To enhance the efficiency and practicality of federated bandit learning, recent advances have introduced incentives to motivate communication among clients, where a client participates only when the incentive offered by the server outweighs its participation cost. However, existing incentive mechanisms naively assume that clients are truthful: they all report their true cost, so the higher the cost a participating client claims, the more the server has to pay. Such mechanisms are therefore vulnerable to strategic clients who misreport to optimize their own utility. To address this issue, we propose an incentive compatible (i.e., truthful) communication protocol, named Truth-FedBan, where the incentive for each participant is independent of its self-reported cost, and reporting the true cost is the only way to achieve the best utility. More importantly, Truth-FedBan still guarantees sub-linear regret and communication cost without any overhead. In other words, the core conceptual contribution of this paper is to demonstrate, for the first time, that incentive compatibility and nearly optimal regret can be achieved simultaneously in federated bandit learning. Extensive numerical studies further validate the effectiveness of our proposed solution.
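The idea that a participant's incentive does not depend on its own report can be sketched with a critical-value payment. The example below is a toy illustration under stated assumptions, not the paper's mechanism: `cheapest_k` is a hypothetical monotone selection rule, and `critical_payment` uses bisection with an assumed upper cap `hi` on reports to find the largest cost a client could claim and still be selected.

```python
from typing import Callable, List, Sequence, Set

def cheapest_k(costs: Sequence[float], k: int = 2) -> Set[int]:
    """Toy monotone rule (assumption): select the k lowest reported costs,
    breaking ties by client index."""
    order = sorted(range(len(costs)), key=lambda j: (costs[j], j))
    return set(order[:k])

def critical_payment(
    select: Callable[[List[float]], Set[int]],
    costs: Sequence[float],
    i: int,
    hi: float = 100.0,
    tol: float = 1e-6,
) -> float:
    """Largest report (up to the assumed cap `hi`) at which client i is still
    selected, holding all other reports fixed. Assumes `select` is monotone."""
    def wins(report: float) -> bool:
        trial = list(costs)
        trial[i] = report
        return i in select(trial)

    if not wins(costs[i]):
        return 0.0          # unselected clients receive nothing
    if wins(hi):
        return hi           # critical value lies at or beyond the cap
    lo = costs[i]
    while hi - lo > tol:    # invariant: wins(lo) is True, wins(hi) is False
        mid = (lo + hi) / 2
        if wins(mid):
            lo = mid
        else:
            hi = mid
    return lo

true_costs = [1.0, 2.0, 4.0, 7.0]
pay_truthful = critical_payment(cheapest_k, true_costs, 0)
pay_inflated = critical_payment(cheapest_k, [3.0, 2.0, 4.0, 7.0], 0)
```

In this toy setting client 0 receives roughly the same payment whether it reports 1.0 or 3.0, because the payment is determined by the other clients' reports. Inflating the report cannot raise the payment; it can only risk losing selection.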

Paper Structure

This paper contains 28 sections, 14 theorems, 32 equations, 4 figures, 1 table, 7 algorithms.

Key Result

Proposition 5

The Truthful Incentive Search algorithm is monotone.
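For intuition about what monotonicity means here, the sketch below assumes the standard mechanism-design reading: a client that is selected at some reported cost stays selected at any lower reported cost, with the other reports held fixed. Both rules and the helper `is_monotone_for` are hypothetical and only check the property empirically on a finite grid of reports.

```python
from typing import Callable, Iterable, List, Sequence, Set

def threshold_rule(costs: Sequence[float], budget: float = 5.0) -> Set[int]:
    """Toy monotone rule: select every client whose report is at most budget."""
    return {j for j, c in enumerate(costs) if c <= budget}

def band_rule(costs: Sequence[float]) -> Set[int]:
    """Toy non-monotone rule: select only reports inside the band [2, 5]."""
    return {j for j, c in enumerate(costs) if 2.0 <= c <= 5.0}

def is_monotone_for(
    select: Callable[[List[float]], Set[int]],
    costs: Sequence[float],
    i: int,
    reports: Iterable[float],
) -> bool:
    """Scan client i's reports in ascending order and return False if the
    client is ever selected after having been rejected at a lower report."""
    rejected = False
    for r in sorted(reports):
        trial = list(costs)
        trial[i] = r
        selected = i in select(trial)
        if rejected and selected:
            return False
        rejected = rejected or not selected
    return True

grid = [0.5, 1.0, 3.0, 6.0]
ok = is_monotone_for(threshold_rule, [1.0, 2.0], 0, grid)
bad = is_monotone_for(band_rule, [1.0, 2.0], 0, grid)
```

A grid check like this can only falsify monotonicity, never prove it; the proposition above is a formal statement about the algorithm itself.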

Figures (4)

  • Figure 1: Comparison between Truth-FedBan and vanilla greedy incentive mechanism.
  • Figure 2: Overall impact of misreporting.
  • Figure (unnumbered): Truthful Incentive Search
  • Figure (unnumbered): Greedy Incentive Search (V2)

Theorems & Definitions (19)

  • Definition 1: Truthfulness
  • Definition 2: Social Cost
  • Definition 3: Monotonicity
  • Definition 4: Critical Payment
  • Proposition 5: Monotonicity
  • Lemma 6
  • Lemma 7: Elimination of Infinite Critical Value
  • Theorem 8
  • Theorem 9: Social Cost
  • Theorem 10: Regret and Communication Cost
  • ...and 9 more