
Fundamental Limits of Man-in-the-Middle Attack Detection in Model-Free Reinforcement Learning

Rishi Rani, Massimo Franceschetti

Abstract

We consider the problem of learning-based man-in-the-middle (MITM) attacks in cyber-physical systems (CPS), and extend our previously proposed Bellman Deviation Detection (BDD) framework for model-free reinforcement learning (RL). We refine the standard MDP attack model by allowing the reward function to depend on both the current and subsequent states, thereby capturing reward variations induced by errors in the adversary's transition estimate. We also derive an optimal system-identification strategy for the adversary that minimizes detectable value deviations. Further, we prove that the agent's asymptotic learning time required to secure the system scales linearly with the adversary's learning time, and that this matches the optimal lower bound. Hence, the proposed detection scheme is order-optimal in detection efficiency. Finally, we extend the framework to asynchronous and intermittent attack scenarios, where reliable detection is preserved.
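The detection idea behind the BDD framework can be illustrated with a minimal sketch: under the learned Q-values, the empirical Bellman residuals of observed transitions should average to zero when the feedback channel is untampered, so a running mean of those residuals serves as a test statistic. The function name `bellman_deviation_mean`, the tabular setting, and the toy example below are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def bellman_deviation_mean(transitions, Q, gamma=0.9):
    """Running mean of empirical Bellman residuals (an illustrative
    BDPM-style statistic, not the paper's exact formulation)."""
    residuals = []
    for s, a, r, s_next in transitions:
        # Bellman residual: r + gamma * max_a' Q(s', a') - Q(s, a)
        delta = r + gamma * np.max(Q[s_next]) - Q[s, a]
        residuals.append(delta)
    return float(np.mean(residuals))

# Toy 2-state, 2-action example with a Bellman-consistent Q-table:
# zero rewards and a zero Q-table give zero residuals.
Q = np.zeros((2, 2))
transitions = [(0, 0, 0.0, 1), (1, 1, 0.0, 0)]
print(bellman_deviation_mean(transitions, Q))  # 0.0
```

A falsified feedback signal that is inconsistent with the true dynamics would shift this mean away from zero, which is the deviation the detector monitors.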

Paper Structure

This paper contains 30 sections, 12 theorems, 64 equations, 3 figures, and 2 algorithms.

Key Result

Theorem 1

In the absence of attacks, the BDPMs (Bellman Deviation Process Means) converge asymptotically as:

Figures (3)

  • Figure 1: (a) Adversary Learning Phase: During this phase, the attacker eavesdrops and learns the system dynamics without altering the feedback signal to the agent. (b) Adversary Attack Phase: During this phase, the attacker intercepts the feedback loop and provides a falsified signal to the agent to induce a target policy or cause value deviation.
  • Figure 2: This figure shows the absolute value of the maximum BDPM for the cases when an attack occurs (in orange) and when the system is secure (in blue). The security bound (in dotted blue) upper-bounds the BDPM when no attack occurs and lower-bounds it when an attack occurs, provided the information advantage condition is met.
  • Figure 3: This figure shows the probability of the agent detecting an attack, calculated based on 1000 trials for varying amounts of agent and adversary learning times.
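The separation property described for Figure 2 suggests a simple decision rule: declare an attack whenever the maximum absolute BDPM exceeds the security bound. The function name `attack_detected` and the numeric values below are illustrative placeholders, not quantities from the paper.

```python
def attack_detected(max_abs_bdpm: float, security_bound: float) -> bool:
    # Under the information advantage condition, the bound separates the
    # attacked regime (statistic above the bound) from the secure regime
    # (statistic below the bound), so a threshold test suffices.
    return max_abs_bdpm > security_bound

print(attack_detected(0.8, 0.5))  # True  (statistic above the bound)
print(attack_detected(0.3, 0.5))  # False (statistic below the bound)
```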

Theorems & Definitions (19)

  • Definition 1: Bellman Deviation Process
  • Definition 2: Bellman Deviation Process Mean
  • Definition 3: Minimum Value Drift
  • Remark 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Definition 4: Asymptotic Bellman Deviation Process Gap
  • ...and 9 more