Single-Loop Federated Actor-Critic across Heterogeneous Environments

Ye Zhu; Xiaowen Gong

TL;DR

SFAC introduces a two-level federated actor-critic framework to learn a single global policy across heterogeneous environments. It decomposes learning into FedC for federated TD-based critic evaluation and FedA for federated policy improvement, operating under a mixture environment to reflect agent heterogeneity. The paper proves finite-time convergence to a near-stationary point, with the convergence error scaling with environment heterogeneity and a linear speedup in sample complexity as the number of agents, $N$, increases. Empirical results on standard RL benchmarks demonstrate improved performance and faster convergence compared to baselines.

Abstract

Federated reinforcement learning (FRL) has emerged as a promising paradigm in which multiple agents collaborate to learn a shared policy that adapts across heterogeneous environments. Among reinforcement learning (RL) algorithms, the actor-critic (AC) algorithm stands out for its low variance and high sample efficiency. However, little is known theoretically about AC in a federated setting, especially when each agent interacts with a potentially different environment. This gap is attributed to several technical challenges: the two-level structure that couples the actor and the critic, heterogeneous environments, Markovian sampling, and multiple local updates. In response, we study Single-loop Federated Actor Critic (SFAC), in which agents perform actor-critic learning in a two-level federated manner while interacting with heterogeneous environments. We then provide bounds on the convergence error of SFAC. The results show that SFAC asymptotically converges to a near-stationary point, with an error proportional to the environment heterogeneity. Moreover, the sample complexity exhibits a linear speed-up through the federation of agents. We evaluate SFAC through numerical experiments on common RL benchmarks, which demonstrate its effectiveness.
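To make the ingredients named above concrete (heterogeneous environments, Markovian sampling, multiple local updates, server-side averaging), the following is a minimal sketch of generic federated TD(0) critic evaluation. It is not the SFAC algorithm from the paper: the two-state Markov reward process, the tabular critic, and the names `local_td`, `federated_td`, and `p_stays` are all assumptions made purely for illustration, and the actor update is omitted.

```python
import random


def local_td(theta, p_stay, rewards, steps, lr, gamma, rng, state):
    """Run `steps` TD(0) updates along one Markovian trajectory.

    theta   : list of 2 value estimates (tabular critic); copied, not mutated.
    p_stay  : probability of staying in the current state. This is the
              agent-specific part, i.e. the (hypothetical) heterogeneity.
    Returns the locally updated critic and the last visited state.
    """
    theta = list(theta)
    for _ in range(steps):
        next_state = state if rng.random() < p_stay else 1 - state
        td_error = rewards[state] + gamma * theta[next_state] - theta[state]
        theta[state] += lr * td_error
        state = next_state
    return theta, state


def federated_td(p_stays, rewards, rounds, local_steps,
                 lr=0.05, gamma=0.9, seed=0):
    """Average the agents' critics after every block of local updates."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]              # shared (server) critic
    states = [0] * len(p_stays)     # each agent keeps its own trajectory
    for _ in range(rounds):
        local_critics = []
        for i, p_stay in enumerate(p_stays):
            updated, states[i] = local_td(
                theta, p_stay, rewards, local_steps, lr, gamma, rng, states[i]
            )
            local_critics.append(updated)
        # Server step: plain average of the local critics.
        theta = [
            sum(critic[s] for critic in local_critics) / len(local_critics)
            for s in range(2)
        ]
    return theta


if __name__ == "__main__":
    # Three agents whose environments differ only in their transition kernel.
    print(federated_td([0.3, 0.5, 0.7], [0.0, 1.0], rounds=200, local_steps=5))
```

In this toy setting the averaged critic tracks a compromise between the agents' individual value functions, which gives some intuition for why a convergence error that depends on environment heterogeneity is plausible; the precise statement and its conditions are those given in the paper, not anything this sketch establishes.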
