Accurate and Scalable Detection and Investigation of Cyber Persistence Threats

Qi Liu; Muhammad Shoaib; Mati Ur Rehman; Kaibin Bao; Veit Hagenmeyer; Wajih Ul Hassan

TL;DR

This work addresses the challenge of detecting cyber persistence threats within Advanced Persistent Threats by reframing persistence as a two-phase process: a persistence setup and a subsequent persistence execution. It introduces Cyber Persistence Detector (CPD), which uses provenance analytics to connect these phases via pseudo-dependency edges and augments tracing with expert-guided edges, complemented by an alert triage mechanism. Empirical evaluation on public datasets and MITRE emulation plans shows CPD achieves substantial false-positive reduction (averaging 93%), accurate persistence attack graphs, and low runtime overhead. The approach provides explainable, scalable persistence detection that leverages MITRE ATT&CK semantics and is adaptable to enterprise logging infrastructures.

Abstract

In Advanced Persistent Threat (APT) attacks, achieving stealthy persistence within target systems is often crucial for an attacker's success. This persistence allows adversaries to maintain prolonged access, often evading detection mechanisms. Recognizing its pivotal role in the APT lifecycle, this paper introduces Cyber Persistence Detector (CPD), a novel system dedicated to detecting cyber persistence through provenance analytics. CPD is founded on the insight that persistent operations typically manifest in two phases: the "persistence setup" and the subsequent "persistence execution". By causally relating these phases, we enhance our ability to detect persistent threats. First, CPD discerns setups signaling an impending persistent threat and then traces processes linked to remote connections to identify persistence execution activities. A key feature of our system is the introduction of pseudo-dependency edges (pseudo-edges), which effectively connect these disjoint phases using data provenance analysis, and expert-guided edges, which enable faster tracing and reduced log size. These edges empower us to detect persistence threats accurately and efficiently. Moreover, we propose a novel alert triage algorithm that further reduces false positives associated with persistence threats. Evaluations conducted on well-known datasets demonstrate that our system reduces the average false positive rate by 93% compared to state-of-the-art methods.
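The two-phase idea described above can be illustrated with a minimal sketch. This is not CPD's implementation: the record types `SetupEvent` and `ExecEvent`, the functions `build_setup_table` and `link_pseudo_edges`, and the matching rule (same persistence location, setup observed before execution) are all hypothetical simplifications chosen for illustration. The real system operates on provenance graphs built from audit logs and applies detection rules that are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupEvent:
    """Hypothetical record of a persistence setup action."""
    process: str      # process that wrote the persistence location
    target: str       # persistence location (e.g. a startup-folder shortcut)
    timestamp: float


@dataclass(frozen=True)
class ExecEvent:
    """Hypothetical record of a process holding a remote connection."""
    process: str      # process that owns the remote connection
    loaded_from: str  # persistence location that triggered the process
    remote_ip: str
    timestamp: float


def build_setup_table(events):
    """Index setup events by their (case-insensitive) persistence location."""
    table = {}
    for ev in events:
        table.setdefault(ev.target.lower(), []).append(ev)
    return table


def link_pseudo_edges(setup_table, exec_events):
    """Return (setup_process, exec_process, location) triples.

    Assumption: a pseudo-edge is drawn when an execution event was
    triggered from a location that an earlier setup event wrote to.
    """
    edges = []
    for ex in exec_events:
        for st in setup_table.get(ex.loaded_from.lower(), []):
            if st.timestamp < ex.timestamp:
                edges.append((st.process, ex.process, st.target))
    return edges
```

In this sketch, the setup table plays the role of a lookup structure filled in a first pass over the logs, so that only processes with remote connections need to be examined in the second pass. Keying the table by location is one simple way to relate the two otherwise disjoint phases.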

Paper Structure (30 sections, 3 equations, 8 figures, 9 tables, 3 algorithms)

Introduction
Limitations of Existing PIDS
Limitations of Rule-based Persistence Detectors
Our Approach and Contributions
Motivation
APT Attack Stages
Persistence Prevalence
Why is Persistence Misunderstood in APT Detection?
Threat Model
System Design
Persistence Threat Detection
Expert-guided Edges
False Positive Reduction
Causality-based pseudo-edges
Correlation-based pseudo-edges
...and 15 more sections

Figures (8)

Figure 1: Stealthiness by persistence
Figure 2: CPD overview. CPD implements a four-step approach for detecting persistence threats, starting with the creation of a persistence setup table from audit logs that tracks potential setup actions. It then traces processes with remote connections to form sub-graphs, which are evaluated against execution rules and aligned with setup actions to form atomic graphs linked by a pseudo-edge. The process is refined through the introduction of pseudo-edge strength and a false positive reduction algorithm.
Figure 3: A persistence attack graph automatically generated by CPD on the EP-APT29-1 dataset. It uses rectangles for processes, ovals for files / Registry keys, and diamonds for network sockets. Annotations include S=Start, W=Write, C=Connect. The graph successfully pinpoints T1547.001 (Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder). The upper section reveals persistence setup: a malicious program disguised as a Microsoft Word document (.doc) starts, spawning a PowerShell instance that creates a shortcut in the Windows startup folder. This shortcut leads to another dropped malicious program, hostui.exe. The lower section, post-reboot, shows persistence execution: explorer.exe auto-executes startup folder shortcuts, triggering malicious PowerShell code that connects to the attacker. Indicative strings are bolded for clarity. CPD forms a pseudo-edge linking the process initiating persistence setup with the one managing the remote connection, i.e., the C2 agent.
Figure 4: An expert-guided edge is created during reconstruction of a T1543.003 persistence setup attack graph. An attacker-controlled malicious process leverages LOLBins to create a malicious service for persistence. The indicative Registry key, however, is modified by a Windows system process, and logs from standard logging frameworks provide no link from the malicious process to that system process.
Figure 5: A false-positive persistence attack graph automatically generated by CPD on the EP-APT29-1 dataset. This graph wrongly classifies an instance of T1547.001. It turns out to be a benign program, i.e., Microsoft OneDrive, leveraging Registry run keys for updates. It in fact connects back to an IP address belonging to Microsoft Corporation.
...and 3 more figures
