Dynamic Graph Communication for Decentralised Multi-Agent Reinforcement Learning
Ben McClusky
TL;DR
This work addresses the challenge of decentralized multi-agent reinforcement learning in dynamic networks, where topology changes and node failures complicate coordination. It extends the NetMon framework by integrating a Graph Attention Network (GAT) layer into recurrent message passing and by introducing a multi-round, attention-guided Iteration Controller that selectively propagates information. The system is trained end-to-end with reinforcement learning under centralized training with decentralized execution (CTDE). Empirically, the approach yields up to 9.5% higher rewards and 6.4% lower communication overhead in dynamic network packet routing compared with baselines, and a 4.8% improvement from the GAT-based aggregation in dynamic settings, alongside improved graph representations and resilience to failures. The results demonstrate the potential for scalable, efficient, and robust decentralized routing in real-world networks. The work also discusses ethical considerations, limitations, and directions for future research.
Abstract
This work presents a novel communication framework for decentralized multi-agent systems operating in dynamic network environments. Integrated into a multi-agent reinforcement learning system, the framework is designed to enhance decision-making by optimizing the network's collective knowledge through efficient communication. Key contributions include adapting a static network packet-routing scenario to a dynamic setting with node failures, incorporating a graph attention network layer in a recurrent message-passing framework, and introducing a multi-round communication targeting mechanism. This approach enables an attention-based aggregation mechanism to be successfully trained within a sparse-reward, dynamic network packet-routing environment using only reinforcement learning. Experimental results show improvements in routing performance, including a 9.5 percent increase in average rewards and a 6.4 percent reduction in communication overhead compared to a baseline system. The study also examines the ethical and legal implications of deploying such systems in critical infrastructure and military contexts, identifies current limitations, and suggests potential directions for future research.
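To make the idea of attention-based aggregation under node failures concrete, the following is a minimal sketch, not the implementation used in this work. It assumes a single attention head and, for brevity, replaces the learned projection and attention vector of a Graph Attention Network with a fixed dot-product score. The names `attention_aggregate`, `states`, `neighbours`, and `alive` are hypothetical and chosen only for illustration.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU, the nonlinearity commonly applied to raw attention scores."""
    return x if x > 0 else slope * x


def dot(u, v):
    """Stand-in score function (assumption): a GAT would learn this instead."""
    return sum(a * b for a, b in zip(u, v))


def attention_aggregate(states, neighbours, alive, score_fn=dot):
    """Return, for every live node, the attention-weighted sum of its own
    state and the states of its live neighbours. Failed nodes are skipped
    both as receivers and as senders."""
    out = {}
    for node, h in states.items():
        if not alive.get(node, False):
            continue  # a failed node neither aggregates nor is updated
        # Candidate senders: the node itself plus neighbours that are still up.
        senders = [node] + [n for n in neighbours.get(node, [])
                            if alive.get(n, False)]
        logits = [leaky_relu(score_fn(h, states[s])) for s in senders]
        # Numerically stable softmax over the candidate senders.
        peak = max(logits)
        exps = [math.exp(l - peak) for l in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        out[node] = [sum(w * states[s][k] for w, s in zip(weights, senders))
                     for k in range(len(h))]
    return out


# Toy topology: node "c" has failed, so "a" attends only to itself and "b".
states = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbours = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
alive = {"a": True, "b": True, "c": False}
aggregated = attention_aggregate(states, neighbours, alive)
```

Because the softmax is taken only over senders that are currently reachable, the weights renormalise automatically when a neighbour fails, which is one simple way an aggregation step can remain well defined as the topology changes.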
