Local Environment Poisoning Attacks on Federated Reinforcement Learning

Evelyn Ma, Praneet Rathi, S. Rasoul Etesami

TL;DR

This work examines how Federated Reinforcement Learning (FRL) systems can be compromised by local environment poisoning applied to a subset of agents. It introduces a general bi-level optimization framework and concrete poisoning protocols for policy-based FRL, including an adversary design with public and private critics in actor-critic settings and reward poisoning for policy-gradient methods. The authors prove that, under certain conditions, the poisoned global objective is lower than that of clean FRL, and they validate the approach with extensive experiments on standard OpenAI Gym tasks using VPG and PPO, showing significantly greater degradation in global performance than baseline attacks achieve. They also discuss a defense mechanism based on per-agent credit scoring, highlighting the practical need for robust FRL algorithms as federated RL deployments scale.

Abstract

Federated learning (FL) has become a popular tool for solving traditional Reinforcement Learning (RL) tasks. The multi-agent structure addresses the data-hungry nature of traditional RL, while the federated mechanism protects the data privacy of individual agents. However, the federated mechanism also exposes the system to poisoning by malicious agents that can mislead the trained policy. Despite the advantages brought by FL, the vulnerability of Federated Reinforcement Learning (FRL) has not been well studied. In this work, we propose a general framework to characterize FRL poisoning as an optimization problem and design a poisoning protocol that can be applied to policy-based FRL. Our framework can also be extended to FRL with actor-critic as the local RL algorithm by training a pair of private and public critics. We prove that our method can strictly hurt the global objective. We verify the effectiveness of our poisoning by conducting extensive experiments targeting mainstream RL algorithms across various OpenAI Gym environments covering a wide range of difficulty levels. Within these experiments, we compare our proposed framework against clean training and baseline poisoning methods. The results show that the proposed framework successfully poisons FRL systems, reducing performance across various environments, and does so more effectively than baseline methods. Our work provides new insights into the vulnerability of FL in RL training and poses new challenges for designing robust FRL algorithms.
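
To make the data flow described in the abstract concrete, the following is a minimal sketch of one federated round in which a single agent trains on poisoned rewards. It is not the authors' protocol: the paper's attack solves an optimization problem, whereas here the perturbation direction is fixed by hand. The function names (`fedavg`, `poison_rewards`, `local_update`), the two-armed toy task, the equal-weight averaging, and the use of an L2 norm for the budget ${\epsilon}$ are all assumptions made for illustration.

```python
import math


def fedavg(local_params):
    """Average parameter vectors element-wise (equal agent weights assumed)."""
    n = len(local_params)
    return [sum(vals) / n for vals in zip(*local_params)]


def poison_rewards(rewards, direction, epsilon):
    """Shift rewards along `direction`, rescaled so the shift has L2 norm <= epsilon.

    The L2 budget is an assumption of this sketch, not taken from the paper.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        return list(rewards)
    scale = min(1.0, epsilon / norm)
    return [r + scale * d for r, d in zip(rewards, direction)]


def local_update(theta, rewards, lr=0.1):
    """Toy local step: raise each arm's score in proportion to its observed reward."""
    return [t + lr * r for t, r in zip(theta, rewards)]


# Hypothetical toy task: arm 0 pays 1, arm 1 pays 0; three agents start from zero.
theta0 = [0.0, 0.0]
true_rewards = [1.0, 0.0]

# Clean round: every agent observes the true rewards.
clean_global = fedavg([local_update(theta0, true_rewards) for _ in range(3)])

# Poisoned round: the third agent's environment reports shifted rewards.
bad_rewards = poison_rewards(true_rewards, direction=[-1.0, 1.0], epsilon=1.0)
poisoned_global = fedavg([
    local_update(theta0, true_rewards),
    local_update(theta0, true_rewards),
    local_update(theta0, bad_rewards),
])

print("clean global parameters:   ", clean_global)
print("poisoned global parameters:", poisoned_global)
```

Even in this toy setting, averaging dilutes the poisoned update but does not remove it: the global score for the better arm ends up lower than in the clean round, and the score for the worse arm ends up higher.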

Paper Structure (13 sections, 2 theorems, 8 equations, 7 figures, 3 tables, 3 algorithms)


Key Result

Theorem 1

Let Assumptions asp:single, asp:fedavg, and asp:smooth hold. Suppose that all agents are updated cleanly in the first $p-1$ rounds, and at round $p$, agent $(n)$ is poisoned. Define a scalar ${\epsilon}_+ := \frac{2\lambda_\theta B}{n L_r }$, where $B$ is a scalar defined as Then, for $B>0$ and ${\epsilon} < {\epsilon}_+$, we have where $\alpha \in[0, \frac{{\epsilon}_+^2}{8}]$.
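
As a worked illustration of the threshold in Theorem 1, the sketch below evaluates ${\epsilon}_+ = \frac{2\lambda_\theta B}{n L_r}$ and the upper end of the range for $\alpha$, namely ${\epsilon}_+^2/8$. The definition of $B$ is not available in this excerpt, so $B$ and the other inputs are treated as given positive scalars; the numeric values and the function names are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def epsilon_plus(lambda_theta, B, n, L_r):
    """Threshold eps_+ = 2 * lambda_theta * B / (n * L_r) from the theorem statement."""
    if B <= 0:
        raise ValueError("the theorem requires B > 0")
    if n <= 0 or L_r <= 0:
        raise ValueError("n and L_r must be positive for the ratio to be defined")
    return 2.0 * lambda_theta * B / (n * L_r)


def alpha_upper_bound(eps_plus):
    """Upper end of the interval alpha in [0, eps_+^2 / 8]."""
    return eps_plus ** 2 / 8.0


def budget_satisfies_condition(epsilon, eps_plus):
    """The theorem's condition on the attack budget is the strict inequality eps < eps_+."""
    return epsilon < eps_plus


# Hypothetical inputs, not taken from the paper.
eps_p = epsilon_plus(lambda_theta=0.5, B=2.0, n=4, L_r=1.0)
print("eps_+ =", eps_p)                        # 2 * 0.5 * 2.0 / (4 * 1.0) = 0.5
print("alpha_max =", alpha_upper_bound(eps_p))  # 0.5**2 / 8 = 0.03125
print("budget 0.4 admissible:", budget_satisfies_condition(0.4, eps_p))
```

With these inputs the threshold scales as $1/n$: doubling the number of agents halves ${\epsilon}_+$ and so shrinks the set of budgets that satisfy ${\epsilon} < {\epsilon}_+$.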

Figures (7)

  • Figure 1: Comparative performance for the VPG FRL system. We plot the performance of local environment poisoning against two baselines: clean training and random attack. We report only the largest system size that the attacker can successfully poison, which is 4 agents for the left two plots and 3 for the right two.
  • Figure 2: Rewards given by a poisoned PPO system (attack budget ${\epsilon} = 1$) with a single attacker under the proposed method are significantly lower than those of a clean system with the same number of agents and those under random attacks. The system size is three agents for InvertedPendulum and four agents for the others.
  • Figure 3: Targeted attack against VPG FRL. We choose environments with different action spaces: CartPole (two discrete actions), LunarLander (four discrete actions), InvertedPendulum (one continuous dimension), HalfCheetah (six continuous dimensions).
  • Figure 4: We train a two-agent FRL system and report results for the clean baseline, the poisoned model (budget ${\epsilon} = 1$), and the poisoned model with a defense mechanism.
  • Figure 5: The two-critic PPO attack (labeled poison in the plot) outperforms the single-critic attack (labeled critic in the plot).
  • ...and 2 more figures

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2: Theorem 1, restated
  • proof