LTRDetector: Exploring Long-Term Relationship for Advanced Persistent Threats Detection

Xiaoxiao Liu, Fan Xu, Nan Wang, Qinxin Zhao, Dalin Zhang, Xibin Zhao, Jiqiang Liu

TL;DR

This paper tackles the detection of Advanced Persistent Threats by exploiting provenance graphs to capture long-term, low-frequency attacker behavior without relying on predefined signatures. It introduces LTRDetector, a three-stage framework that first embeds provenance graphs via BFS walks and Word2Vec embeddings with causality-preserving compression, then leverages a Transformer encoder–decoder to extract long-term features, and finally uses K-means clustering on normal data to identify anomalies. The approach is reported to outperform existing methods across five public datasets, and it is designed to handle the zero-day exploits and long attack durations that confound signature-based methods. The findings suggest practical impact for enterprise security by enabling real-time, unsupervised APT detection with scalable graph-based representations and rich context modeling.
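To make the first stage more concrete, the following is a minimal sketch of turning a provenance graph into breadth-first node sequences that an embedding model such as Word2Vec could later consume. It is an illustration under stated assumptions, not the authors' implementation: the toy graph, the node labels, the `bfs_walk` helper, and the `max_len` parameter are all hypothetical, and the walk here is a plain deterministic BFS rather than the BFS random walk shown in the paper's Figure 2.

```python
from collections import deque


def bfs_walk(graph, start, max_len):
    """Return up to max_len nodes in breadth-first order, starting at start.

    graph is an adjacency dict mapping a node label to its successor labels.
    (Hypothetical helper for illustration only.)
    """
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < max_len:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                visited.append(nxt)
                queue.append(nxt)
                if len(visited) == max_len:
                    break
    return visited


# Hypothetical toy provenance graph: processes, files, and a socket.
toy_graph = {
    "proc:bash": ["file:/etc/passwd", "proc:curl"],
    "proc:curl": ["sock:10.0.0.5:443", "file:/tmp/payload"],
    "file:/tmp/payload": ["proc:payload"],
}

# One walk per source node; each walk plays the role of a "sentence"
# whose "words" are node labels.
walks = [bfs_walk(toy_graph, node, max_len=4) for node in toy_graph]
```

Each walk keeps a node together with its immediate neighborhood, which is the kind of local context a skip-gram style model learns from. Passing `walks` to a Word2Vec implementation is left out so the sketch stays dependency-free.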

Abstract

Advanced Persistent Threats (APTs) are challenging to detect due to their prolonged duration, infrequent occurrence, and adept concealment techniques. Existing approaches primarily concentrate on the observable traits of attack behaviors, neglecting the intricate relationships formed throughout the persistent attack lifecycle. We therefore present LTRDetector, an end-to-end APT detection framework. LTRDetector employs a graph embedding technique that retains comprehensive contextual information, then derives long-term features from the embedded provenance graphs. During this process, we compress the system provenance graph data for effective feature learning. Furthermore, to detect attacks that use zero-day exploits, we capture the system's regular behavior and detect abnormal activities without relying on predefined attack signatures. We also conduct extensive evaluations on five prominent datasets, and the results underscore the superiority of LTRDetector over existing state-of-the-art techniques.
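The idea of learning regular behavior and flagging deviations without attack signatures can be sketched as follows. This is a minimal illustration, assuming feature vectors have already been extracted: it clusters normal vectors with a small pure-Python K-means and flags any vector that lies farther from every centroid than the farthest normal vector does. The 2-D sample data, the first-`k`-points initialization, and the maximum-distance threshold rule are assumptions made for this example and are not taken from the paper.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, k, iters=20):
    """Tiny K-means; centroids start at the first k points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: euclidean(p, centroids[i]))
            clusters[idx].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


def nearest_distance(p, centroids):
    """Distance from p to its closest centroid."""
    return min(euclidean(p, c) for c in centroids)


# Hypothetical feature vectors observed during normal operation only.
normal = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 4.9), (0.1, 0.3), (4.8, 5.2)]
centroids = kmeans(normal, k=2)

# Assumed threshold rule: the largest distance seen among normal samples.
threshold = max(nearest_distance(p, centroids) for p in normal)


def is_anomalous(p):
    """Flag p if it is farther from all centroids than any normal sample."""
    return nearest_distance(p, centroids) > threshold
```

Because only normal data is used to fit the centroids and the threshold, nothing in this sketch depends on knowing what an attack looks like, which is the property the abstract relies on for zero-day detection.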

Paper Structure

This paper contains 22 sections, 10 equations, 8 figures, and 6 tables.

Figures (8)

  • Figure 1: The framework of LTRDetector.
  • Figure 2: An example of the BFS and DFS random walk.
  • Figure 3: An example of how low-frequency anomalous behaviors can accumulate continuously and come to differ significantly from normal behaviors.
  • Figure 4: Long-Term features extraction.
  • Figure 5: The Long-Term features extraction model. The input $\{X^1, X^2, \ldots, X^n\}$ is the feature sequence of a provenance graph generated by Word2Vec in the previous step. The input $\{f_1, f_2, \ldots, f_n\}$ is the feature vector of a provenance graph generated by the Encoder.
  • ...and 3 more figures