A Factored MDP Approach To Moving Target Defense With Dynamic Threat Modeling and Cost Efficiency

Megha Bose, Praveen Paruchuri, Akshat Kumar

TL;DR

This paper addresses cyber defense under Moving Target Defense (MTD) when attacker payoffs are unknown. It introduces the Adaptive Threat-Aware Factored MDP (ATA-FMDP), which couples a dynamic Bayesian network of attacker responses with a factored MDP and solves it with Approximate Linear Programming to produce scalable, switching-cost–aware policies. A theoretical negative result shows that sublinear policy regret is impossible against adaptive adversaries when more than one configuration exists, while empirical evaluations in web-application and network domains demonstrate substantial gains over baselines. The approach adapts to evolving threat landscapes by updating attacker-type probabilities online and integrating them into defender planning, which makes it relevant to real-world MTD deployments.

Abstract

Moving Target Defense (MTD) has emerged as a proactive and dynamic framework to counteract evolving cyber threats. Traditional MTD approaches often rely on assumptions about the attacker's knowledge and behavior. However, real-world scenarios are inherently more complex, with adaptive attackers and limited prior knowledge of their payoffs and intentions. This paper introduces a novel approach to MTD using a Markov Decision Process (MDP) model that does not rely on predefined attacker payoffs. Our framework integrates the attacker's real-time responses into the defender's MDP using a dynamic Bayesian network. By employing a factored MDP model, we provide a comprehensive and realistic system representation. We also incorporate incremental updates to an attack response predictor as new data emerges, which ensures an adaptive and robust defense mechanism. Additionally, we consider the costs of switching configurations in MTD, integrating them into the reward structure to balance execution and defense costs. We first highlight the challenges of the problem through a theoretical negative result on regret. However, empirical evaluations demonstrate the framework's effectiveness in scenarios marked by high uncertainty and dynamically changing attack landscapes.
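
The two mechanisms named in the abstract, incremental updates of the attacker model and switching costs folded into the reward, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the dictionary-based belief, and the additive reward form are assumptions made for illustration, and `alpha` is taken to be the relative weight on switching costs.

```python
def update_type_belief(belief, likelihoods):
    """Bayes-rule update of the defender's belief over attacker types.

    belief:      dict mapping attacker type -> prior probability
    likelihoods: dict mapping attacker type -> P(observed response | type)
    Both dicts and their keys are hypothetical; the paper's predictor may differ.
    """
    unnormalized = {t: belief[t] * likelihoods[t] for t in belief}
    total = sum(unnormalized.values())
    if total == 0:
        # Observation is impossible under every type: keep the prior unchanged.
        return dict(belief)
    return {t: p / total for t, p in unnormalized.items()}


def defender_reward(expected_attack_loss, switch_cost, alpha):
    """Assumed reward shape: negative attack loss minus weighted switching cost."""
    return -(expected_attack_loss + alpha * switch_cost)


prior = {"type_1": 0.5, "type_2": 0.5}
observed = {"type_1": 0.8, "type_2": 0.2}  # made-up likelihoods for one observation
posterior = update_type_belief(prior, observed)
reward = defender_reward(expected_attack_loss=2.0, switch_cost=1.0, alpha=0.5)
```

A larger `alpha` makes staying in the current configuration more attractive, which is the trade-off between execution and defense costs that the abstract describes.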

Paper Structure

This paper contains 25 sections, 1 theorem, 56 equations, 12 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

For any MTD defense strategy on $n>1$ configurations, there exists an adaptive adversary such that the defender's policy regret compared to the best static configuration in hindsight is $\Omega(T)$.
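
To see why linear policy regret can arise, consider a standard textbook-style construction, which is not necessarily the one used in the paper's proof: the adversary fixes a "trap" configuration and attacks successfully at every step if the defender's first configuration was the trap. Policy regret compares the realized loss with the loss a static configuration would have incurred had it been played from the start, so the adversary's reactions are replayed on the counterfactual history. The names `adaptive_loss` and `policy_regret` are hypothetical.

```python
def adaptive_loss(history, trap):
    """Adversary's loss for the current step, given the defender's history so far.

    The loss depends only on the FIRST configuration the defender played:
    1.0 if it was the trap configuration, 0.0 otherwise.
    """
    return 1.0 if history[0] == trap else 0.0


def policy_regret(defender_actions, trap, n_configs):
    """Realized loss minus the loss of the best static configuration,
    where each static alternative is evaluated on its own counterfactual history."""
    horizon = len(defender_actions)
    realized = sum(
        adaptive_loss(defender_actions[: t + 1], trap) for t in range(horizon)
    )
    best_static = min(
        sum(adaptive_loss([c] * (t + 1), trap) for t in range(horizon))
        for c in range(n_configs)
    )
    return realized - best_static


# A defender alternating between two configurations, starting with configuration 0.
actions = [0, 1] * 50
regret_if_trapped = policy_regret(actions, trap=0, n_configs=2)
regret_if_not_trapped = policy_regret(actions, trap=1, n_configs=2)
```

Against a deterministic defender the adversary simply sets the trap to the defender's first configuration, giving regret equal to the horizon $T$. Against a randomized defender it can pick the most likely first configuration, which is played with probability at least $1/n$, so expected regret is at least $T/n$ and still grows linearly in $T$.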

Figures (12)

  • Figure 1: Representation of attacker types showing the adaptive aspects that each attacker type can attack in configurations $s$ and $s'$. For example, here, attacker type $\tau_2$ can attack adaptive aspect $s^2$ of $s$ and $s'^2$ of $s'$.
  • Figure 2: Dynamic Bayesian Network of Defender-Attacker Interactions
  • Figure 3: The scheme of Algorithm 1
  • Figure 4: In the web application environment, configurations $C_1 = (PHP, MySQL)$, $C_2 = (Python, MySQL)$, $C_3 = (PHP, Postgres)$, $C_4 = (Python, Postgres)$
  • Figure 5: Web Application Environment - Evolving Attack Landscape where between timesteps $330$ and $660$, the $unknown$ attacker prevails. Graphs (a,b,c) show cumulative defender rewards for each defender strategy, while (d,e,f) show the evolution of defender rewards over the $1000$ timesteps. $\alpha$ refers to the relative weight given to switching costs.
  • ...and 7 more figures

Theorems & Definitions (2)

  • Theorem 1
  • proof