Flooding with Absorption: An Efficient Protocol for Heterogeneous Bandits over Complex Networks

Junghyun Lee, Laura Schmid, Se-Young Yun

TL;DR

The paper studies collaborative multi-agent stochastic bandits with heterogeneous arm sets distributed over general networks. It analyzes standard flooding with UCB, derives a topology-aware regret bound using clique covers, and introduces Flooding with Absorption (FwA) to reduce communication while preserving near-Flooding performance. Theoretical results provide regret bounds for both Flooding and FwA and a matching lower bound in Gaussian settings, while experiments on static and dynamic networks show that FwA dramatically reduces communication and link congestion with only modest regret loss. The work demonstrates a principled integration of network diffusion and arm heterogeneity, offering a scalable, topology-agnostic protocol for large, evolving networks.

Abstract

Multi-armed bandits are extensively used to model sequential decision-making, making them ubiquitous in many real-life applications such as online recommender systems and wireless networking. We consider a multi-agent setting where each agent solves its own bandit instance endowed with a different set of arms. The agents' goal is to minimize their group regret while collaborating via some communication protocol over a given network. Previous literature on this problem considered arm heterogeneity and networked agents only separately. In this work, we introduce a setting that encompasses both features. For this novel setting, we first provide a rigorous regret analysis for a standard flooding protocol combined with the classic UCB policy. Then, to mitigate the high communication costs incurred by flooding in complex networks, we propose a new protocol called Flooding with Absorption (FwA). We provide a theoretical analysis of the resulting regret bound and discuss the advantages of using FwA over flooding. Lastly, we experimentally verify on various scenarios, including dynamic networks, that FwA leads to significantly lower communication costs than other network protocols, with only minimal loss in regret performance.

Paper Structure (23 sections, 8 theorems, 32 equations, 7 figures, 1 table, 2 algorithms)

Key Result

Theorem 2

Algorithm alg:ucb-flooding with $\texttt{absorb} = \texttt{False}$, $f(t) = t^\alpha$, $\alpha > \max\left( \frac{1}{2}, \frac{2\sigma^2}{\gamma + 1} \right)$, and $\gamma \in \{1, \ldots, \mathrm{diam}(\mathcal{G})\}$ achieves the group regret upper bound where

Figures (7)

  • Figure 1: Communication network and arm heterogeneity.
  • Figure 2: Flooding with Absorption (FwA). a, An agent ($v_1$) pulls one of its arms ($a_2$). b, $v_1$ sends a message $m$ to its neighbors, with a time-to-live (TTL) of $\gamma$. c, Since one receiver of the message ($v_4$) does not have $a_2$ in its arm set, it forwards $m$ to its neighbors except the originator $v_1$. The other receiver ($v_2$) has arm $a_2$ in its arm set, and thus absorbs $m$.
  • Figure 3: Comparing group regret and (cumulative) communication complexity across different topologies and protocols. Note that FwA gives a good trade-off between regret and communication complexity.
  • Figure 4: FwA significantly decreases congestion on sparse network links. We find that, in comparison with other protocols, FwA results in a reduced number of messages sent over such a sparse link (highlighted in the networks).
  • Figure 5: Comparing group regret and (cumulative) communication complexity in a dynamic network setting. Note that FwA achieves the same regret as Flooding, with much lower cumulative communication complexity.
  • ...and 2 more figures
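
The forwarding rule described in the Figure 2 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function name `propagate`, the five-agent toy graph, and the arm sets are all hypothetical. The sketch also assumes that the TTL is decremented once per hop, that a forwarding agent skips the neighbor it received the message from, and that each agent handles a given message only once. Setting `absorb=False` gives plain flooding under the same assumptions.

```python
from collections import deque


def propagate(neighbors, arm_sets, origin, arm, ttl, absorb=True):
    """Spread one message about `arm` from `origin`.

    Returns (receivers, messages_sent). Hypothetical helper for illustration.
    """
    receivers = set()
    seen = {origin}  # assumption: an agent processes a given message only once
    messages_sent = 0
    # Each queue entry: (forwarding agent, agent it received from, remaining TTL)
    queue = deque([(origin, None, ttl)])
    while queue:
        agent, sender, remaining = queue.popleft()
        if remaining <= 0:
            continue  # TTL exhausted, the message is not forwarded further
        for nxt in neighbors[agent]:
            if nxt == sender:
                continue  # do not send the message back where it came from
            messages_sent += 1
            if nxt in seen:
                continue  # duplicate copy is dropped on arrival
            seen.add(nxt)
            receivers.add(nxt)
            if absorb and arm in arm_sets[nxt]:
                continue  # receiver owns the arm: it absorbs the message
            queue.append((nxt, agent, remaining - 1))
    return receivers, messages_sent


# Hypothetical toy network: v3 - v2 - v1 - v4 - v5
neighbors = {
    "v1": ["v2", "v4"],
    "v2": ["v1", "v3"],
    "v3": ["v2"],
    "v4": ["v1", "v5"],
    "v5": ["v4"],
}
arm_sets = {
    "v1": {"a1", "a2"},
    "v2": {"a2"},
    "v3": {"a2", "a3"},
    "v4": {"a1"},
    "v5": {"a2"},
}

fwa_receivers, fwa_cost = propagate(neighbors, arm_sets, "v1", "a2", ttl=3)
flood_receivers, flood_cost = propagate(
    neighbors, arm_sets, "v1", "a2", ttl=3, absorb=False
)
print(sorted(fwa_receivers), fwa_cost)
print(sorted(flood_receivers), flood_cost)
```

In this toy network, `v2` absorbs the message, so `v3` never receives it even though it also holds `a2`. In exchange, FwA sends one message fewer than flooding. This is a small-scale instance of the regret versus communication trade-off that the figures report.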

Theorems & Definitions (14)

  • Definition 1
  • Theorem 2
  • Corollary 3
  • Remark 4
  • Definition 5
  • Definition 6
  • Remark 7
  • Theorem 8
  • Corollary 9
  • Definition 10
  • ...and 4 more