Over-the-air Federated Policy Gradient
Huiwen Yang, Lingying Huang, Subhrakanti Dey, Ling Shi
TL;DR
This work introduces over-the-air federated policy gradient (OA-FPG) for scalable multi-agent RL, where agents transmit analog gradient updates over a shared wireless channel and a central controller updates the policy parameter $\boldsymbol{\theta}$ from the superposed signal $\boldsymbol{v}_k$. The authors prove $L$-smoothness of the objective under standard RL assumptions and establish convergence with a linear speedup in the number of agents $N$ under favorable channel statistics, providing explicit complexity bounds to reach an $\epsilon$-approximate stationary point. They further analyze the impact of channel noise and fading, deriving conditions under which the performance degrades gracefully, and validate the approach via simulations on OpenAI-like tasks with Rayleigh and Nakagami channels. The results indicate substantial communication efficiency gains for large-scale federated RL without sacrificing convergence properties, with potential extensions to fully decentralized and collaborative setups.
Abstract
In recent years, over-the-air aggregation has been widely considered in large-scale distributed learning, optimization, and sensing. In this paper, we propose the over-the-air federated policy gradient algorithm, in which all agents simultaneously broadcast an analog signal carrying local information over a common wireless channel, and a central controller uses the received aggregated waveform to update the policy parameters. We investigate the effect of noise and channel distortion on the convergence of the proposed algorithm, and establish the communication and sampling complexities for finding an $\epsilon$-approximate stationary point. Finally, we present simulation results that demonstrate the effectiveness of the algorithm.
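The aggregation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy objective $J(\boldsymbol{\theta}) = -\tfrac{1}{2}\|\boldsymbol{\theta}\|^2$ stands in for the RL return, the per-agent Rayleigh gains, the additive Gaussian channel noise, and the rescaling by $N$ times the mean channel gain are all assumptions made for illustration, and the names `local_gradient`, `over_the_air_aggregate`, and `run` are hypothetical.

```python
import numpy as np


def local_gradient(theta, rng, sample_noise_std=0.1):
    """Hypothetical stand-in for one agent's policy gradient estimate.

    Assumes the toy objective J(theta) = -0.5 * ||theta||^2, whose exact
    gradient is -theta; Gaussian noise mimics sampling error.
    """
    return -theta + sample_noise_std * rng.standard_normal(theta.shape)


def over_the_air_aggregate(signals, rng, channel_noise_std=0.05, rayleigh_scale=1.0):
    """Superpose the agents' analog signals on a shared channel.

    signals has shape (N, d). Each agent's signal is scaled by an
    independent Rayleigh fading gain (an assumed channel model), and the
    receiver observes the sum plus additive Gaussian noise.
    """
    n_agents, dim = signals.shape
    gains = rng.rayleigh(scale=rayleigh_scale, size=n_agents)
    noise = channel_noise_std * rng.standard_normal(dim)
    return (gains[:, None] * signals).sum(axis=0) + noise


def run(n_agents=20, dim=5, steps=200, step_size=0.05, rayleigh_scale=1.0, seed=0):
    """Gradient ascent driven only by the received superposed signal."""
    rng = np.random.default_rng(seed)
    theta = np.ones(dim)
    # Mean of a Rayleigh variable with scale s is s * sqrt(pi / 2).
    mean_gain = rayleigh_scale * np.sqrt(np.pi / 2.0)
    for _ in range(steps):
        grads = np.stack([local_gradient(theta, rng) for _ in range(n_agents)])
        v = over_the_air_aggregate(grads, rng, rayleigh_scale=rayleigh_scale)
        # Assumed normalization: divide by N * E[gain] so that v acts as
        # an unbiased estimate of the average gradient.
        theta = theta + step_size * v / (n_agents * mean_gain)
    return theta


if __name__ == "__main__":
    theta = run()
    print("final parameter norm:", np.linalg.norm(theta))
```

In this sketch the controller never sees an individual gradient, only the superposed signal, so one channel use per iteration serves all $N$ agents. Summing over more agents also averages out both the sampling noise and the fading fluctuations in the normalized update.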
