Collaborative Information Dissemination with Graph-based Multi-Agent Reinforcement Learning
Raffaele Galliera, Kristen Brent Venable, Matteo Bassani, Niranjan Suri
TL;DR
The paper tackles efficient information dissemination in dynamic broadcast networks by framing the problem as a Partially Observable Stochastic Game (POSG) and solving it with Graph Convolutional Reinforcement Learning built on Graph Attention Networks (GATs). Two architectures, Local-DyAN (L-DyAN) and Hyperlocal-DyAN (HL-DyAN), use dynamic attention to capture local network structure, so that each agent bases its forwarding decision on observations of its one-hop neighborhood, reducing redundant transmissions while maximizing coverage. Both are trained with cooperative dynamic neighborhoods and a dueling Q-network, and they achieve higher network coverage and lower communication overhead than state-of-the-art heuristics such as RFC7188 and baseline graph networks, especially under higher mobility. The work demonstrates the practicality of graph-based MARL for decentralized information dissemination, with potential impact on disaster response, sensor networks, and vehicular communications, and lays the groundwork for future task- and priority-aware extensions.
Abstract
Efficient information dissemination is crucial for supporting critical operations across domains like disaster response, autonomous vehicles, and sensor networks. This paper introduces a Multi-Agent Reinforcement Learning (MARL) approach as a significant step forward in achieving more decentralized, efficient, and collaborative information dissemination. We propose a Partially Observable Stochastic Game (POSG) formulation for information dissemination, empowering each agent to decide on message forwarding independently, based on the observation of its one-hop neighborhood. This constitutes a significant paradigm shift from the heuristics currently employed in real-world broadcast protocols. Our novel approach harnesses Graph Convolutional Reinforcement Learning and Graph Attention Networks (GATs) with dynamic attention to capture essential network features. We propose two approaches, L-DyAN and HL-DyAN, which differ in terms of the information exchanged among agents. Our experimental results show that our trained policies outperform existing methods, including the state-of-the-art heuristic, in terms of both network coverage and communication overhead on dynamic networks of varying density and behavior.
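
As a rough illustration of the ingredients named in the abstract (dynamic attention over a one-hop neighborhood feeding a dueling Q-network), the following NumPy sketch may help. It is not the authors' implementation: the GATv2-style scoring function, the random untrained weights, the feature sizes, the four-node line topology, and the two-action set (stay silent or forward) are all assumptions chosen for illustration.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def dynamic_attention(h, adj, W_l, W_r, a):
    """GATv2-style attention restricted to one-hop neighbors (assumed form).

    h: (n, f) node features, adj: (n, n) 0/1 adjacency matrix.
    Returns the attended features (n, d) and the attention weights (n, n).
    """
    n = h.shape[0]
    # Each agent attends to itself and its one-hop neighbors only.
    mask = adj.astype(bool) | np.eye(n, dtype=bool)
    left = h @ W_l    # (n, d): contribution of the attending node i
    right = h @ W_r   # (n, d): contribution of the attended node j
    # scores[i, j] = a . LeakyReLU(W_l h_i + W_r h_j)
    scores = leaky_relu(left[:, None, :] + right[None, :, :]) @ a
    scores = np.where(mask, scores, -np.inf)
    # Row-wise softmax; the self-loop guarantees a finite maximum per row.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ right, weights

def dueling_q(z, w_value, W_adv):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, .)."""
    value = z @ w_value   # (n,)
    adv = z @ W_adv       # (n, n_actions)
    return value[:, None] + adv - adv.mean(axis=1, keepdims=True)

# Hypothetical setup: 4 agents on a line 0-1-2-3, 3 input features, 8 hidden units.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
h = rng.normal(size=(4, 3))
W_l = rng.normal(size=(3, 8))
W_r = rng.normal(size=(3, 8))
a = rng.normal(size=8)
w_value = rng.normal(size=8)
W_adv = rng.normal(size=(8, 2))   # assumed actions: 0 = stay silent, 1 = forward

z, weights = dynamic_attention(h, adj, W_l, W_r, a)
q = dueling_q(z, w_value, W_adv)
actions = q.argmax(axis=1)        # greedy per-agent forwarding decision
print(actions)
```

Because the attention mask is rebuilt from the current adjacency matrix on every call, the same weights can be applied as the topology changes, which is the property that makes this family of models a natural fit for dynamic networks. With untrained random weights the printed decisions carry no meaning.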
