Cyber Key Terrain Identification Using Adjusted PageRank Centrality

Lukáš Sadlek; Pavel Čeleda

TL;DR

This work identifies cyber key terrain by translating a host's network position into a PageRank centrality score, replacing the single global damping factor with per-edge damping factors $d_{uv}$ learned from source and destination port pairs. The authors use hill climbing and random walk to optimize these factors during a one-time learning phase on static graphs, then apply a streaming PageRank computation to IP-flow data, with unseen edges defaulting to $0.85$. On a cyber defense exercise dataset and a campus-network dataset, the adjusted centrality achieves higher $F_1$ scores than traditional PageRank and processes large IP-flow streams in near real time. The approach offers a scalable, flow-aware way to prioritize cyber assets for defense; temporal fluctuations, data-labeling challenges, and memory consumption are acknowledged as limitations and left for future work.

Abstract

The cyber terrain contains devices, network services, cyber personas, and other network entities involved in network operations. Designing a method that automatically identifies the network entities that are key to network operations is challenging. However, such a method is essential for determining which cyber assets cyber defense should focus on. In this paper, we propose an approach for classifying IP addresses that belong to cyber key terrain according to their network position, using a PageRank centrality computation adjusted by machine learning. We used hill climbing and random walk algorithms to differentiate PageRank's damping factors based on the source and destination ports captured in IP flows. The one-time learning phase on a static data sample allows near-real-time stream-based classification of key hosts from IP flow data in operational conditions without maintaining a complete network graph. We evaluated the approach on a dataset from a cyber defense exercise and on data from the campus network. The results show that cyber key terrain identification using the adjusted computation of centrality is more precise than its original version.
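
To make the idea of port-dependent damping factors concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one plausible update rule, $PR(v) = \frac{1 - 0.85}{N} + \sum_{u \to v} d_{uv} \frac{PR(u)}{\mathrm{outdeg}(u)}$, which may differ from the formula used in the paper. It also builds a small static graph rather than reproducing the stream-based computation. The function name `adjusted_pagerank`, the flow tuple layout, and the `damping_by_ports` dictionary are hypothetical names introduced for illustration; only the default of $0.85$ for unseen port pairs comes from the text.

```python
from collections import defaultdict

# Fallback for port pairs that were not seen during learning (value from the text).
DEFAULT_DAMPING = 0.85


def adjusted_pagerank(flows, damping_by_ports, iterations=50):
    """Power-iteration PageRank in which every edge has its own damping factor.

    flows: iterable of (src_ip, dst_ip, src_port, dst_port) tuples (assumed layout).
    damping_by_ports: dict mapping (src_port, dst_port) -> learned damping factor.
    Returns a dict mapping each IP address to its centrality score.
    """
    # Map each directed edge to a damping factor; a repeated edge keeps the
    # factor of the last flow seen for it (a simplification for this sketch).
    edge_damping = {}
    for src, dst, sport, dport in flows:
        edge_damping[(src, dst)] = damping_by_ports.get((sport, dport), DEFAULT_DAMPING)

    nodes = sorted({ip for edge in edge_damping for ip in edge})
    out_degree = defaultdict(int)
    for src, _dst in edge_damping:
        out_degree[src] += 1

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Teleportation term uses the default factor (an assumption of this sketch).
        new_rank = {node: (1.0 - DEFAULT_DAMPING) / n for node in nodes}
        for (src, dst), d in edge_damping.items():
            new_rank[dst] += d * rank[src] / out_degree[src]
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Two clients contact a server on port 443; the server answers one of them.
    flows = [
        ("10.0.0.1", "10.0.0.9", 50000, 443),
        ("10.0.0.2", "10.0.0.9", 50001, 443),
        ("10.0.0.9", "10.0.0.1", 443, 50000),
    ]
    print(adjusted_pagerank(flows, {}))
```

In this sketch, raising the factor for port pairs that point at a service port pushes more score toward the hosts that receive such flows, which is the effect the learning phase would tune against labeled key hosts.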

Paper Structure (14 sections, 4 equations, 1 figure, 4 tables, 1 algorithm)

Introduction
Related Work
Cyber Key Terrain Mapping
Network Centrality Measures
Hill Climbing and Random Walk
IP Flow
Method for Cyber Key Terrain Identification
Learning Phase
Computation Phase
Evaluation
Dataset from Cyber Defense Exercise
Dataset from the Campus Network
Limitations
Conclusion

Figures (1)

Figure 1: Line graphs containing F1 scores for heuristics and the PageRank with default damping factor divided according to six team networks.
