Scalable Learning of Intrusion Responses through Recursive Decomposition

Kim Hammar, Rolf Stadler

TL;DR

The paper addresses scalable automated intrusion response by modeling the interaction between a defender and an attacker as a two-player partially observed stochastic game and introducing a recursive, graph-based decomposition of that game. It proves that the game decomposes into independent workflow and node subgames with optimal substructure and, using optimal stopping theory, shows that the defender's best responses exhibit threshold structures (switching curves). The authors develop Decompositional Fictitious Self-Play (DFSP) to learn Nash equilibria across the decomposed subgames and evaluate it in a digital twin-based emulation of a realistic infrastructure, where the learned strategies approximate an equilibrium and DFSP significantly outperforms a state-of-the-art algorithm. The result is a scalable, principled framework for automated intrusion response, validated on a high-fidelity digital twin and aimed at large-scale IT environments.

Abstract

We study automated intrusion response for an IT infrastructure and formulate the interaction between an attacker and a defender as a partially observed stochastic game. To solve the game we follow an approach where attack and defense strategies co-evolve through reinforcement learning and self-play toward an equilibrium. Solutions proposed in previous work prove the feasibility of this approach for small infrastructures but do not scale to realistic scenarios due to the exponential growth in computational complexity with the infrastructure size. We address this problem by introducing a method that recursively decomposes the game into subgames which can be solved in parallel. Applying optimal stopping theory we show that the best response strategies in these subgames exhibit threshold structures, which allows us to compute them efficiently. To solve the decomposed game we introduce an algorithm called Decompositional Fictitious Self-Play (DFSP), which learns Nash equilibria through stochastic approximation. We evaluate the learned strategies in an emulation environment where real intrusions and response actions can be executed. The results show that the learned strategies approximate an equilibrium and that DFSP significantly outperforms a state-of-the-art algorithm for a realistic infrastructure configuration.
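
The self-play idea in the abstract, where each side repeatedly best-responds to the other's observed behavior until the strategies settle near an equilibrium, can be illustrated with classic fictitious play on a small zero-sum matrix game. This is only a minimal sketch and not the paper's DFSP algorithm: it has no partial observability, no decomposition into subgames, and no stochastic approximation. The function names and the matching-pennies payoff matrix are assumptions made for illustration.

```python
def fictitious_play(payoff, iterations=5000):
    """Classic fictitious play on a zero-sum matrix game (illustrative only).

    payoff[i][j] is the row player's payoff and the column player's loss.
    Returns the empirical (time-averaged) strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    # Arbitrary initial actions so the empirical strategies are defined.
    row_counts[0] = 1
    col_counts[0] = 1
    for _ in range(iterations):
        # Value of each action against the opponent's action counts.
        # Counts are unnormalized; this does not change the best response.
        row_values = [sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
                      for j in range(n_cols)]
        # Row player maximizes, column player minimizes (ties go to the
        # lowest index).
        row_br = max(range(n_rows), key=row_values.__getitem__)
        col_br = min(range(n_cols), key=col_values.__getitem__)
        row_counts[row_br] += 1
        col_counts[col_br] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([c / row_total for c in row_counts],
            [c / col_total for c in col_counts])


def exploitability(payoff, row_strategy, col_strategy):
    """Total gain available to both players from deviating to a best response.

    The value is zero exactly at a Nash equilibrium of the zero-sum game.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row_value = max(
        sum(payoff[i][j] * col_strategy[j] for j in range(n_cols))
        for i in range(n_rows))
    best_col_value = min(
        sum(payoff[i][j] * row_strategy[i] for i in range(n_rows))
        for j in range(n_cols))
    return best_row_value - best_col_value


if __name__ == "__main__":
    # Matching pennies: the unique equilibrium mixes both actions equally.
    matching_pennies = [[1, -1], [-1, 1]]
    row, col = fictitious_play(matching_pennies)
    print(row, col, exploitability(matching_pennies, row, col))
```

The exploitability measure gives a simple way to check how close a pair of learned strategies is to an equilibrium, which is the kind of question the evaluation in the abstract addresses at a much larger scale.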

Paper Structure

This paper contains 29 sections, 2 theorems, 24 equations, 11 figures, 6 tables, and 1 algorithm.

Key Result

Theorem 1

Figures (11)

  • Figure 1: The target infrastructure and the actors involved in the intrusion response use case.
  • Figure 2: Our framework for finding and evaluating intrusion response strategies [hammar_stadler_tnsm, hammar_stadler_game_23, csle_docs].
  • Figure 3: Attacker actions: (i) reconnaissance actions; (ii) brute-force attacks; and (iii) code execution attacks.
  • Figure 4: Defender actions: (i) migrate a node between two zones; (ii) redirect or block traffic flows to a node; (iii) shut down a node; and (iv) revoke access to a node.
  • Figure 5: Dependency graph of a workflow consisting of a tree of virtual network functions and microservices; fw, lb, and idps are acronyms for firewall, load balancer, and intrusion detection and prevention system, respectively.
  • ...and 6 more figures

Theorems & Definitions (3)

  • Theorem 1
  • proof
  • Theorem 2: Decomposition theorem