
When to Deceive: A Cross-Layer Stackelberg Game Framework for Strategic Timing of Cyber Deception

Ya-Ting Yang, Quanyan Zhu

TL;DR

The paper addresses when to deploy and switch cyber deception against adaptive attackers. It proposes a bi-level Stackelberg framework: a lower tactical layer, modeled as a one-sided information Markov game, captures the attacker-defender dynamics under a given deception, while an upper strategic layer decides when to switch deception techniques through a stopping-time policy. An integrated algorithm combines dynamic programming with belief-state updates to compute equilibrium strategies at the tactical layer and switching policies at the strategic layer. Numerical experiments on an example enterprise network show that strategically timed deception can improve the defender's expected utility and reduce the risk of critical-asset compromise compared with baseline strategies. The approach offers a resource-aware way to coordinate deception deployment across the attacker's lifetime.

Abstract

Cyber deception is an emerging proactive defense strategy to counter increasingly sophisticated attacks such as Advanced Persistent Threats (APTs) by misleading and distracting attackers from critical assets. However, since deception techniques incur costs and may lose effectiveness over time, defenders must strategically time and select them to adapt to the dynamic system and the attacker's responses. In this study, we propose a Stackelberg game-based framework to design strategic timing for cyber deception: the lower tactical layer (follower) captures the evolving attacker-defender dynamics under a given deception through a one-sided information Markov game, while the upper strategic layer (leader) employs a stopping-time decision process to optimize the timing and selection of deception techniques. We also introduce a computational algorithm that integrates dynamic programming and belief-state updates to account for the attacker's adaptive behavior and limited deception resources. Numerical experiments validate the framework, showing that strategically timed deceptions can enhance the defender's expected utility and reduce the risk of asset compromise compared to baseline strategies.
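The belief-state update mentioned in the abstract can be illustrated with a plain Bayes-rule step. This is a minimal sketch rather than the paper's algorithm: it assumes the uninformed player keeps a probability vector over a finite set of hidden types and observes one action per stage. The function name `update_belief` and the example numbers are hypothetical.

```python
from typing import List, Sequence


def update_belief(prior: Sequence[float], likelihood: Sequence[float]) -> List[float]:
    """Bayes-rule update of a belief over hidden types (illustrative only).

    prior[i]      -- current probability assigned to hidden type i
    likelihood[i] -- probability of the observed action under type i
                     (assumed to come from the informed player's strategy)
    """
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    if total == 0.0:
        # The observation has zero probability under the prior;
        # this sketch simply keeps the prior unchanged.
        return list(prior)
    return [j / total for j in joint]


# Hypothetical example: two hidden types, uniform prior, and an observed
# action that is three times as likely under type 0 as under type 1.
belief = update_belief([0.5, 0.5], [0.9, 0.3])
print(belief)
```

In this example the belief shifts toward type 0 (roughly 0.75 versus 0.25), since the observed action is more likely under that type.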

Paper Structure

This paper contains 11 sections, 10 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: An example enterprise network and attack path. The path contains 5 steps (web server, site IT, site HR, site developer, and critical asset). The path starts from the web server, which is open to the external network, and the defender seeks to optimize the timing and selection of deception.
  • Figure 2: Value at the initial stage for different budgeted numbers of switches and varying attacker lifetimes. When the budget is zero, three cases are analyzed, one for each initial game mode (modes 0, 1, and 2); when the budget is non-zero, optimal switching is considered.
  • Figure 3: The results for the total rewards (sum of immediate reward at each stage and the terminal reward) under different switching strategies. Here, $K=10$, and each scenario is evaluated over $200$ experiments.
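The effect of a switching budget on the initial-stage value, as compared in Figure 2, can be sketched with a toy finite-horizon dynamic program. This is not the paper's algorithm: the tactical-layer game value is replaced by a hypothetical deterministic reward table `REWARD[mode][stage]`, and `SWITCH_COST`, `HORIZON`, and the function `value` are illustrative assumptions.

```python
from functools import lru_cache

# Hypothetical per-stage defender rewards, indexed as REWARD[mode][stage].
REWARD = [
    [3.0, 2.0, 1.0, 0.0],  # mode 0: effective early, then decays
    [1.0, 1.0, 1.0, 1.0],  # mode 1: steady
    [0.0, 1.0, 2.0, 3.0],  # mode 2: effective late
]
SWITCH_COST = 0.5          # assumed cost paid at each switch
HORIZON = len(REWARD[0])   # number of stages (stands in for the attacker lifetime)


@lru_cache(maxsize=None)
def value(stage: int, mode: int, budget: int) -> float:
    """Best total reward from `stage` onward, given the mode currently in
    place and the number of switches still allowed."""
    if stage == HORIZON:
        return 0.0
    best = float("-inf")
    for nxt in range(len(REWARD)):
        switching = nxt != mode
        if switching and budget == 0:
            continue  # no switches left: only the current mode is feasible
        total = (
            REWARD[nxt][stage]
            - (SWITCH_COST if switching else 0.0)
            + value(stage + 1, nxt, budget - int(switching))
        )
        best = max(best, total)
    return best


# Initial-stage value when starting in mode 0 with 0, 1, or 2 allowed switches.
for b in range(3):
    print(b, value(0, 0, b))
```

With these made-up numbers, a zero budget gives 6.0 (stay in mode 0 throughout), while one allowed switch gives 9.5 (move to mode 2 at the third stage). A second switch adds nothing here. The paper's framework additionally accounts for the attacker's adaptive behavior and belief updates, which this sketch omits.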

Theorems & Definitions (3)

  • Definition 1: Tactical Layer Problem (TLP)
  • Definition 2: Strategic Layer Problem (SLP)
  • Definition 3: One-sided $\epsilon$-PBNE