Convergence Guarantees for Federated SARSA with Local Training and Heterogeneous Agents
Paul Mangold, Eloïse Berthier, Eric Moulines
TL;DR
This work provides the first finite-time convergence guarantees for FedSARSA in a heterogeneous federated reinforcement learning setting with local updates and linear function approximation. It introduces an exact multi-step error expansion for single-agent SARSA and extends it to FedSARSA by analyzing a unique limit point θ_* defined by a federated TD fixed-point equation. The analysis shows a linear speed-up in the number of agents, up to higher-order terms due to Markovian sampling. The results quantify how transition and reward heterogeneity induce bias, detail the role of Markovian noise, and establish explicit sample and communication complexities. Numerical experiments corroborate the theory, illustrating linear speed-up and the impact of local-update bias in heterogeneous environments.
Abstract
We present a novel theoretical analysis of Federated SARSA (FedSARSA) with linear function approximation and local training. We establish convergence guarantees for FedSARSA in the presence of heterogeneity, both in local transitions and rewards, providing the first sample and communication complexity bounds in this setting. At the core of our analysis is a new, exact multi-step error expansion for single-agent SARSA, which is of independent interest. Our analysis precisely quantifies the impact of heterogeneity, demonstrating the convergence of FedSARSA with multiple local updates. Crucially, we show that FedSARSA achieves linear speed-up with respect to the number of agents, up to higher-order terms due to Markovian sampling. Numerical experiments support our theoretical findings.
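To make the setting concrete, the following is a minimal sketch of federated SARSA with linear function approximation and local training: each agent runs a few on-policy SARSA steps on its own environment, and a server then averages the parameters. It is an illustration under stated assumptions, not the authors' implementation. The function names, the softmax policy, the random tabular environments, and all hyperparameter values are hypothetical choices made for this example, and any projection or step-size schedule the analysis may require is omitted.

```python
import numpy as np


def softmax_policy(theta, phi_s):
    """Action probabilities from linear Q-values; phi_s has shape (n_actions, d).
    The softmax policy is an assumption made for this sketch."""
    q = phi_s @ theta
    p = np.exp(q - q.max())  # subtract the max for numerical stability
    return p / p.sum()


def local_sarsa(theta, env, state, action, phi, n_local, step_size, gamma, rng):
    """Run n_local on-policy SARSA steps on one agent, starting from the shared theta."""
    P, R = env  # P: (S, A, S) transition probabilities, R: (S, A) rewards
    theta = theta.copy()
    for _ in range(n_local):
        next_state = rng.choice(P.shape[2], p=P[state, action])
        next_action = rng.choice(P.shape[1], p=softmax_policy(theta, phi[next_state]))
        td_error = (R[state, action]
                    + gamma * phi[next_state, next_action] @ theta
                    - phi[state, action] @ theta)
        theta = theta + step_size * td_error * phi[state, action]
        state, action = next_state, next_action
    return theta, state, action


def fed_sarsa(envs, phi, n_rounds, n_local, step_size, gamma, seed=0):
    """Alternate local SARSA updates on every agent with server-side averaging."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(phi.shape[2])
    chains = [(0, 0) for _ in envs]  # each agent keeps its own Markov chain
    for _ in range(n_rounds):
        local_thetas = []
        for c, env in enumerate(envs):
            s, a = chains[c]
            th, s, a = local_sarsa(theta, env, s, a, phi, n_local,
                                   step_size, gamma, rng)
            chains[c] = (s, a)
            local_thetas.append(th)
        theta = np.mean(local_thetas, axis=0)  # communication round: average
    return theta


def make_random_envs(n_agents, n_states, n_actions, heterogeneity, seed=0):
    """Hypothetical heterogeneous agents: perturb a shared base MDP.
    heterogeneity in [0, 1] mixes in agent-specific transitions and rewards."""
    rng = np.random.default_rng(seed)
    base_P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    base_R = rng.uniform(size=(n_states, n_actions))
    envs = []
    for _ in range(n_agents):
        noise_P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
        P = (1 - heterogeneity) * base_P + heterogeneity * noise_P
        R = base_R + heterogeneity * rng.uniform(-1, 1, size=(n_states, n_actions))
        envs.append((P, R))
    return envs


# Example run with arbitrary illustrative sizes: 4 agents, 5 states, 3 actions.
feat_rng = np.random.default_rng(0)
phi = feat_rng.normal(size=(5, 3, 4))               # features phi(s, a) in R^4
phi /= np.linalg.norm(phi, axis=2, keepdims=True)   # unit-norm features
envs = make_random_envs(n_agents=4, n_states=5, n_actions=3, heterogeneity=0.2)
theta = fed_sarsa(envs, phi, n_rounds=20, n_local=5, step_size=0.05, gamma=0.9)
print(theta)
```

In this sketch, `n_local` plays the role of the number of local updates between communications, and `heterogeneity` controls how far each agent's transitions and rewards deviate from the shared base environment. Setting `heterogeneity=0.0` gives identical agents, which is a convenient baseline when exploring how local training interacts with heterogeneity.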
