Fully Distributed Online Training of Graph Neural Networks in Networked Systems
Rostyslav Olshevskyi, Zhongyuan Zhao, Kevin Chan, Gunjan Verma, Ananthram Swami, Santiago Segarra
TL;DR
This work introduces a fully distributed online training framework for graph neural networks (GNNs) in networked systems, addressing the limitations of centralized training. It reformulates the training of a graph convolutional neural network (GCNN) as a distributed optimization problem and provides a three-part solution: fully distributed backpropagation to estimate local gradients, distributed SGD with consensus-based gradient aggregation, and communication-efficient mini-batch strategies, including piggybacking and information reuse. The approach supports supervised, unsupervised, and reinforcement learning pipelines. Numerical results show performance close to that of centralized training in node regression, UWMMSE power allocation, and wireless link scheduling. The work lays groundwork for scalable, adaptive intelligent networks; future directions include convergence proofs and robustness to real-world communication constraints.
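To make "consensus-based gradient aggregation" concrete, the following is a minimal sketch of generic average consensus, in which each node repeatedly mixes its local gradient estimate with those of its neighbors. It is an illustration under stated assumptions, not the authors' exact procedure: the Metropolis weighting rule, the function names `metropolis_weights` and `consensus_average`, and the toy path graph are all hypothetical choices made here.

```python
def metropolis_weights(adj):
    """Build symmetric, doubly stochastic mixing weights from a 0/1 adjacency matrix.

    Assumption: Metropolis weights are used here only because they need
    nothing beyond the degrees of the two endpoints of each link.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j]:
                W[i][j] = 1.0 / (1 + max(deg[i], deg[j]))
        # Diagonal entry makes each row sum to one (W[i][i] is still 0 here).
        W[i][i] = 1.0 - sum(W[i])
    return W


def consensus_average(adj, local_grads, num_rounds):
    """Run num_rounds of neighbor averaging on per-node gradient vectors.

    Each round, node i only combines values from itself and its direct
    neighbors, so one round corresponds to one exchange of local messages.
    """
    n = len(adj)
    W = metropolis_weights(adj)
    x = [list(g) for g in local_grads]
    dim = len(x[0])
    for _ in range(num_rounds):
        x = [
            [
                sum(W[i][j] * x[j][k] for j in range(n) if j == i or adj[i][j])
                for k in range(dim)
            ]
            for i in range(n)
        ]
    return x


# Toy example (hypothetical): a 4-node path graph with scalar local gradients.
path_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
estimates = consensus_average(path_adj, [[1.0], [2.0], [3.0], [6.0]], num_rounds=200)
```

On a connected graph, every node's estimate approaches the network-wide average of the local gradients (3.0 in the toy example), which is the quantity a centralized SGD step would use. A practical scheme would run only a limited number of rounds per step, trading aggregation accuracy for communication.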
Abstract
Graph neural networks (GNNs) are powerful tools for developing scalable, decentralized artificial intelligence in large-scale networked systems, such as wireless networks, power grids, and transportation networks. Currently, GNNs in networked systems mostly follow a paradigm of "centralized training, distributed execution", which limits their adaptability and slows their development cycles. In this work, we fill this gap for the first time by developing a communication-efficient, fully distributed online training approach for GNNs applied to large networked systems. For a mini-batch of $B$ samples, training an $L$-layer GNN with our approach adds only $L$ rounds of message passing to the $LB$ rounds required by GNN inference, with doubled message sizes. Through numerical experiments in graph-based node regression, power allocation, and link scheduling in wireless networks, we demonstrate the effectiveness of our approach in training GNNs under supervised, unsupervised, and reinforcement learning paradigms.
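The round counts stated above can be tallied with a short sketch. It only restates the abstract's arithmetic ($LB$ inference rounds plus $L$ additional training rounds); the function name `message_passing_rounds` and the example values of $L$ and $B$ are hypothetical, and the doubling of message sizes is deliberately not modeled.

```python
def message_passing_rounds(num_layers, batch_size):
    """Return (inference_rounds, extra_training_rounds, total_rounds).

    Follows the counts given in the abstract: inference on a mini-batch of
    B samples with an L-layer GNN takes L * B rounds, and training adds L.
    """
    inference_rounds = num_layers * batch_size
    extra_training_rounds = num_layers
    total_rounds = inference_rounds + extra_training_rounds
    return inference_rounds, extra_training_rounds, total_rounds


# Hypothetical example: a 3-layer GNN and a mini-batch of 32 samples.
inference, extra, total = message_passing_rounds(num_layers=3, batch_size=32)
relative_overhead = extra / inference  # equals 1 / batch_size
```

Under these counts, the number of extra rounds relative to inference is $L / (LB) = 1/B$, so the overhead in rounds shrinks as the mini-batch grows.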
