Provably Robust Federated Reinforcement Learning

Minghong Fang, Xilong Wang, Neil Zhenqiang Gong

TL;DR

The paper tackles poisoning attacks in federated reinforcement learning (FRL). It introduces a Normalized attack that maximizes the angular deviation between the pre-attack and post-attack aggregated policy updates, and shows that this attack defeats existing Byzantine-robust defenses. It then proposes an ensemble FRL defense that trains multiple global policies over disjoint groups of agents and combines their test-time actions by majority vote for discrete action spaces or by the geometric median for continuous ones, with formal robustness guarantees when the number of malicious agents stays below a threshold. Experiments on Cart Pole, Lunar Lander, and Inverted Pendulum show that the Normalized attack can significantly disrupt non-ensemble Byzantine-robust aggregation rules, while the ensemble approach keeps performance close to the no-attack level against both existing attacks and the new one. The work links a new attack to a practical, provably robust ensemble defense, with implications for safer multi-agent learning in privacy-preserving, distributed environments.

Abstract

Federated reinforcement learning (FRL) allows agents to jointly learn a global decision-making policy under the guidance of a central server. While FRL has advantages, its decentralized design makes it prone to poisoning attacks. To mitigate this, Byzantine-robust aggregation techniques tailored for FRL have been introduced. Yet, in our work, we reveal that these current Byzantine-robust techniques are not immune to our newly introduced Normalized attack. Unlike previous attacks, which aim to enlarge the distance between policy updates before and after an attack, our Normalized attack focuses on maximizing the angle of deviation between these updates. To counter these threats, we develop an ensemble FRL approach that is provably secure against both known attacks and our newly proposed attack. Our ensemble method trains multiple global policies, each learned by a group of agents using any foundational aggregation rule. These well-trained global policies then individually predict the action for a specific test state. The final action is chosen by majority vote for discrete action spaces or by the geometric median for continuous ones. Our experimental results across different settings show that the Normalized attack can greatly disrupt non-ensemble Byzantine-robust methods, and that our ensemble approach offers substantial resistance against poisoning attacks.
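The test-time combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tie-breaking rule (the smallest action wins), and the use of Weiszfeld iterations to approximate the geometric median are all assumptions made for illustration.

```python
from collections import Counter

import numpy as np


def majority_vote(actions):
    """Return the most frequent discrete action.

    Assumption: ties are broken in favour of the smallest action value.
    """
    counts = Counter(actions)
    best = max(counts.values())
    return min(a for a, c in counts.items() if c == best)


def geometric_median(points, n_iter=100, eps=1e-8):
    """Approximate the geometric median of K action vectors.

    Assumption: Weiszfeld iterations are used here; the source does not
    say how the geometric median is computed.
    """
    pts = np.asarray(points, dtype=float)
    pts = pts.reshape(len(pts), -1)          # one row per policy's action
    median = pts.mean(axis=0)                # start from the plain average
    for _ in range(n_iter):
        dist = np.linalg.norm(pts - median, axis=1)
        weights = 1.0 / np.maximum(dist, eps)  # clamp to avoid division by zero
        new_median = (weights[:, None] * pts).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_median - median) < eps:
            return new_median
        median = new_median
    return median
```

In this sketch, each of the K global policies contributes one predicted action for the test state. A single outlying prediction shifts a plain average but has little effect on the majority vote or the geometric median, which is the intuition behind using these two rules.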

Paper Structure

This paper contains 32 sections, 3 theorems, 19 equations, 16 figures, 6 tables, 2 algorithms.

Key Result

Theorem 1

Consider an FRL system with $n$ agents and a test state $s$, where the action space $\mathcal{A}$ is discrete. The agents are divided into $K$ non-overlapping groups based on the hash values of their IDs, and each group trains its global policy using an aggregation rule $\text{AR}$. Define actions $x$ and $y$ …, where $v(s, x)$ and $v(s, y)$ represent the pre-attack frequencies of actions $x$ and $y$ for state $s$ …
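The hash-based grouping mentioned in the theorem statement could be sketched as follows. The source only says that groups are formed from the hash values of agent IDs; the choice of SHA-256, the modulo mapping, and the names `assign_group` and `partition_agents` are assumptions for illustration.

```python
import hashlib


def assign_group(agent_id: str, num_groups: int) -> int:
    """Map an agent ID to a group index in [0, num_groups).

    Assumption: SHA-256 of the ID, reduced modulo the number of groups.
    """
    digest = hashlib.sha256(agent_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_groups


def partition_agents(agent_ids, num_groups):
    """Split agents into non-overlapping groups by hashed ID."""
    groups = [[] for _ in range(num_groups)]
    for agent_id in agent_ids:
        groups[assign_group(agent_id, num_groups)].append(agent_id)
    return groups
```

Because the assignment depends only on the ID, each agent lands in exactly one group, and the partition is reproducible across runs. Group sizes are not guaranteed to be equal under this scheme.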

Figures (16)

  • Figure 1: Illustration of the effects of our Normalized attack. $\bm{\theta}^1$ is the initial global policy, $\bm{\theta}^*$ is a local optimum.
  • Figure 2: Illustration of our ensemble framework with discrete action space.
  • Figure 3: Results on Cart Pole dataset.
  • Figure 4: Impact of the fraction of malicious agents on our ensemble method, where the Cart Pole dataset is considered.
  • Figure 5: Different variants of our Normalized attack, where the Cart Pole dataset is considered.
  • ...and 11 more figures

Theorems & Definitions (4)

  • Theorem 1: Discrete Action Space
  • Theorem 2: Continuous Action Space
  • Remark
  • Lemma 1