Compressed Federated Reinforcement Learning with a Generative Model

Ali Beikmohammadi; Sarit Khirirat; Sindri Magnússon

TL;DR

CompFedRL is a communication-efficient FedRL approach that combines periodic aggregation with (direct or error-feedback) compression. A finite-time analysis shows strong convergence behavior under either compression scheme.

Abstract

Reinforcement learning has recently gained unprecedented popularity, yet it still suffers from sample inefficiency. Federated reinforcement learning (FedRL) addresses this challenge by letting agents collaboratively learn a single policy through aggregation of their local estimates. However, this aggregation step incurs significant communication costs. In this paper, we propose CompFedRL, a communication-efficient FedRL approach that incorporates both periodic aggregation and (direct or error-feedback) compression mechanisms. Specifically, we consider compressed federated $Q$-learning with a generative model, where a central server learns an optimal $Q$-function by periodically aggregating compressed $Q$-estimates from local agents. We characterize, for the first time, the impact of these two mechanisms, which had remained elusive, through a finite-time analysis of our algorithm, and show strong convergence behavior under either direct or error-feedback compression. Our bounds indicate improved solution accuracy with respect to the number of agents and other federated hyperparameters, together with reduced communication costs. To corroborate our theory, we also conduct in-depth numerical experiments with the Top-$K$ and Sparsified-$K$ sparsification operators.
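To make the two mechanisms named in the abstract concrete, the following is a minimal sketch of Top-$K$ sparsification, a textbook error-feedback step, and server-side averaging. It is an illustration under assumptions, not the paper's algorithm: the function names (`top_k`, `error_feedback_step`, `aggregate`) are hypothetical, vectors stand for flattened $Q$-estimates, and the error-feedback rule shown is the standard "compress the update plus the carried-over residual" form, which may differ in detail from the variant analyzed in the paper.

```python
def top_k(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    keep = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)[:k]
    out = [0.0] * len(v)
    for j in keep:
        out[j] = v[j]
    return out


def error_feedback_step(update, memory, k):
    """Standard error feedback (assumed form): compress update + memory,
    and carry whatever was not transmitted into the next round."""
    corrected = [u + m for u, m in zip(update, memory)]
    message = top_k(corrected, k)
    new_memory = [c - s for c, s in zip(corrected, message)]
    return message, new_memory


def aggregate(messages):
    """Server side: coordinate-wise average of the agents' messages."""
    n = len(messages)
    return [sum(column) / n for column in zip(*messages)]


# One agent, one communication round, K = 2 out of d = 4 coordinates.
message, memory = error_feedback_step([0.5, -2.0, 0.1, 1.5], [0.0] * 4, k=2)
```

In this example only the two largest-magnitude coordinates are sent, and the remaining two are stored in `memory` so that they can be included in a later message. Direct compression corresponds to sending `top_k(update, k)` and discarding the residual.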

Paper Structure (39 sections, 9 theorems, 46 equations, 22 figures, 1 algorithm)

Introduction
Contributions.
Related Work
Single-agent RL Algorithms
Distributed and Federated RL Algorithms
Communication-Efficient Learning Algorithms
Notation
Preliminaries and Background
Discounted Infinite-horizon MDP
Policy, and $Q$-function
Optimal Policy and Bellman Operator
RL with a Generative Model
CompFedRL
Compression Options for CompFedRL
Convergence Analysis
...and 24 more sections

Key Result

Lemma

Let $\textsf{Compress}(v)$ be Sparsified-$K$ and $p_{\min} = \min_{j \in [1,d]} p_j$. Then, $\textsf{Compress}(v)$ is UnbiasedComp with $q_2 = 1/p_{\min} - 1$ and $q_\infty = \max(1/p_{\min} - 1, 1)$.
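The constants in this lemma can be checked numerically on a small vector. The sketch below rests on two assumptions that this excerpt does not state: that Sparsified-$K$ keeps coordinate $j$ independently with probability $p_j$ and rescales it to $v_j / p_j$ (the usual unbiased random sparsifier), and that UnbiasedComp requires $\mathbb{E}[\textsf{Compress}(v)] = v$, $\mathbb{E}\|\textsf{Compress}(v) - v\|_2^2 \le q_2 \|v\|_2^2$, and $\|\textsf{Compress}(v) - v\|_\infty \le q_\infty \|v\|_\infty$. The helper names are hypothetical. Expectations are computed exactly by enumerating all $2^d$ keep/drop masks, so no sampling is involved.

```python
from itertools import product


def sparsify(v, p, mask):
    """One realization (assumed operator): keep v[j] / p[j] where mask[j] == 1."""
    return [vj / pj if m else 0.0 for vj, pj, m in zip(v, p, mask)]


def exact_moments(v, p):
    """Enumerate all 2^d masks and return
    (E[C(v)], E||C(v) - v||_2^2, max over masks of ||C(v) - v||_inf)."""
    d = len(v)
    mean = [0.0] * d
    mse = 0.0
    worst_inf = 0.0
    for mask in product((0, 1), repeat=d):
        prob = 1.0
        for m, pj in zip(mask, p):
            prob *= pj if m else 1.0 - pj
        c = sparsify(v, p, mask)
        err = [cj - vj for cj, vj in zip(c, v)]
        for j in range(d):
            mean[j] += prob * c[j]
        mse += prob * sum(e * e for e in err)
        if prob > 0.0:
            worst_inf = max(worst_inf, max(abs(e) for e in err))
    return mean, mse, worst_inf


v = [1.0, -2.0, 0.5]
p = [0.5, 0.25, 0.8]
p_min = min(p)
q2 = 1.0 / p_min - 1.0
q_inf = max(1.0 / p_min - 1.0, 1.0)

mean, mse, worst_inf = exact_moments(v, p)
norm2_sq = sum(x * x for x in v)
norm_inf = max(abs(x) for x in v)

unbiased = all(abs(m - x) < 1e-9 for m, x in zip(mean, v))
second_moment_ok = mse <= q2 * norm2_sq + 1e-9
inf_norm_ok = worst_inf <= q_inf * norm_inf + 1e-9
```

Under these assumptions the exact mean-squared error equals $\sum_j v_j^2 (1/p_j - 1)$, which is at most $(1/p_{\min} - 1)\|v\|_2^2$. The $\max(\cdot, 1)$ in $q_\infty$ accounts for dropped coordinates, whose error is $|v_j|$ itself.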

Figures (22)

Figure 1: Impact of compression: RMSE of the $Q$-estimates for both FedRL [woo2023blessing, jin2022federated] (without compression) and CompFedRL on the Map $11\times11$ task. Here, $I= 50$, $\eta =0.01$, $\beta= 0.8$, and (Left) $K=1$; (Right) $K=10$.
Figure 2: Impact of the number of agents and local epochs: RMSE of the $Q$-estimates for CompFedRL under Top-50 sparsification on the Map $11\times11$ task. Here, (Left) $T\times K = 10000$, $\eta =0.01$, and $\beta=1$; (Right) $T= 10000$, $K=1$, and $\beta=0.8$.
Figure 3: Impact of the federated parameter: RMSE of the $Q$-estimates for CompFedRL under (Left) Top-50 and (Right) Sparsified-50 sparsification on the Map $11\times11$ task. Here, $I= 50$, $T= 10000$, $K=1$, and $\eta =0.1$.
Figure 4: Impact of the learning rate: RMSE of the $Q$-estimates for CompFedRL under (Left) Top-50 and (Right) Sparsified-50 sparsification on the Map $11\times11$ task. Here, $I= 50$, $T= 10000$, $K=1$, and $\beta=0.8$.
Figure 5: Noisy Grid-world environments used as experimental tasks. Each requires navigating through Empty (white) cells toward the Goal (green tile) state while avoiding collisions with Walls (black tiles).
...and 17 more figures

Theorems & Definitions (13)

Definition: UnbiasedComp
Lemma
Definition: BiasedComp
Lemma
Theorem
Corollary
Theorem
Corollary
Lemma
Lemma
...and 3 more
