Distributed Detection of Adversarial Attacks in Multi-Agent Reinforcement Learning with Continuous Action Space

Kiarash Kazari, Ezzeldin Shereen, György Dán

TL;DR

This work tackles the challenge of detecting adversarial attacks in cooperative multi-agent reinforcement learning with continuous action spaces using a decentralized detector. It learns action distributions of nearby agents as parametric multivariate Gaussians via NET^{ij}, then computes a normality score z_t^{ij} and applies a two-sided CUSUM to detect mean shifts in real time. The approach, termed Parameterized Gaussian CUSUM (PGC), is evaluated on four PettingZoo environments against diverse attack strategies, consistently outperforming discrete-action baselines while offering lower computational complexity; the authors also show the importance of non-diagonal covariance, potential gains from parameter sharing, and effectiveness with multiple victims. The results support the method's potential for real-time situational awareness and resilience in safety-critical MARL deployments, enabling timely isolation or mitigation of compromised agents.

Abstract

We address the problem of detecting adversarial attacks against cooperative multi-agent reinforcement learning with continuous action space. We propose a decentralized detector that relies solely on the local observations of the agents and makes use of a statistical characterization of the normal behavior of observable agents. The proposed detector utilizes deep neural networks to approximate the normal behavior of agents as parametric multivariate Gaussian distributions. Based on the predicted density functions, we define a normality score and provide a characterization of its mean and variance. This characterization allows us to employ a two-sided CUSUM procedure for detecting deviations of the normality score from its mean, serving as a detector of anomalous behavior in real time. We evaluate our scheme on various multi-agent PettingZoo benchmarks against different state-of-the-art attack methods, and our results demonstrate the effectiveness of our method in detecting impactful adversarial attacks. In particular, it outperforms the discrete counterpart by achieving AUC-ROC scores of over 0.95 against the most impactful attacks in all evaluated environments.
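The detection pipeline described above (predicted Gaussian, normality score, two-sided CUSUM) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, for illustration only, that the normality score is the log-density of the observed action under the predicted Gaussian; under that assumption the score has mean $-\frac{1}{2}(d\log 2\pi + \log|\boldsymbol{\Sigma}|) - \frac{d}{2}$ and variance $\frac{d}{2}$ for a $d$-dimensional action, which is used here to standardize it. The names `standardized_score`, `TwoSidedCusum`, and `first_alarm`, as well as the `drift` and `threshold` values, are hypothetical, and the predicted mean and covariance are passed in directly instead of coming from a trained network.

```python
import numpy as np


def gaussian_log_density(a, mu, sigma):
    """Log-density of N(mu, sigma) at a; sigma may be a full (non-diagonal) covariance."""
    d = mu.shape[0]
    diff = a - mu
    _, logdet = np.linalg.slogdet(sigma)
    maha = diff @ np.linalg.solve(sigma, diff)  # squared Mahalanobis distance
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def standardized_score(a, mu, sigma):
    """Assumed normality score: the log-density, centred and scaled by its
    mean and standard deviation when a really is drawn from N(mu, sigma)."""
    d = mu.shape[0]
    _, logdet = np.linalg.slogdet(sigma)
    mean = -0.5 * (d * np.log(2.0 * np.pi) + logdet) - 0.5 * d
    std = np.sqrt(0.5 * d)
    return (gaussian_log_density(a, mu, sigma) - mean) / std


class TwoSidedCusum:
    """Tabular two-sided CUSUM on a score with nominal mean 0 and variance 1."""

    def __init__(self, drift=0.5, threshold=30.0):
        self.drift = drift          # hypothetical allowance per step
        self.threshold = threshold  # hypothetical alarm level
        self.upper = 0.0            # accumulates evidence of an upward mean shift
        self.lower = 0.0            # accumulates evidence of a downward mean shift

    def update(self, score):
        """Fold in one score; return True if either statistic exceeds the threshold."""
        self.upper = max(0.0, self.upper + score - self.drift)
        self.lower = max(0.0, self.lower - score - self.drift)
        return self.upper > self.threshold or self.lower > self.threshold


def first_alarm(scores, drift=0.5, threshold=30.0):
    """Index of the first step at which the detector raises an alarm, or None."""
    detector = TwoSidedCusum(drift, threshold)
    for t, score in enumerate(scores):
        if detector.update(score):
            return t
    return None
```

In use, an observer would compute `standardized_score` at every time step from the observed action and the predicted mean and covariance, then feed the result to `TwoSidedCusum.update`. The lower statistic reacts to actions that are unusually unlikely under the prediction, while the upper statistic reacts to actions that are suspiciously close to the predicted mean.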

Paper Structure

This paper contains 40 sections, 1 theorem, 11 equations, 7 figures, and 12 tables.

Key Result

Proposition 1

Consider agents $i,j\in\mathcal{K}$ at time $t$, where $j\in\mathcal{K}^i$, and assume $a_t^j|\tau_t^i\sim \mathcal{N}(\mu_t^{ij},\boldsymbol{\Sigma}_t^{ij})$. Then the first and second moments of $z_t^{ij}$ are

Figures (7)

  • Figure 1: System model with agent $j$ as victim.
  • Figure 2: Proposed detection scheme for agent $i$ as the observer and agent $j$ as the potential victim.
  • Figure 3: Structure of $\text{NET}^{ij}$ used for prediction.
  • Figure 4: ROC of the proposed method against different attacks in all environments.
  • Figure 5: An example of a one-dimensional multi-modal distribution.
  • ...and 2 more figures

Theorems & Definitions (2)

  • Proposition 1
  • proof