METANOIA: A Lifelong Intrusion Detection and Investigation System for Mitigating Concept Drift

Jie Ying, Tiantian Zhu, Aohan Zheng, Tieming Chen, Mingqi Lv, Yan Chen

TL;DR

METANOIA tackles concept drift in provenance-based intrusion detection by making anomaly detection lifelong through incremental learning. It combines four mechanisms: pseudo-edges to combat catastrophic forgetting, suspicious-state transfer to avoid learning malicious behaviors, path-level filtering for precise alerts, and mini-graphs to reconstruct attack scenarios. On DARPA TC benchmarks, it improves precision over state-of-the-art baselines by $30\%$ at the window level, $54\%$ at the graph level, and $29\%$ at the node level, indicating fewer false positives caused by drift. The approach targets continuous detection and thorough investigation in evolving enterprise environments.

Abstract

As Advanced Persistent Threat (APT) complexity increases, provenance data is increasingly used for detection. Anomaly-based systems are gaining attention because they require no prior attack knowledge and can counter zero-day vulnerabilities. However, traditional detection paradigms, which train on offline, limited-size data, often overlook concept drift, that is, unpredictable changes in the distribution of streaming data over time. This leads to high false positive rates. We propose incremental learning as a new paradigm to mitigate this issue. Integrating incremental learning, however, raises four challenges. First, a long-running incremental system must combat catastrophic forgetting (C1) and avoid learning malicious behaviors (C2). Second, the system needs to achieve precise alerts (C3) and reconstruct attack scenarios (C4). We present METANOIA, the first lifelong detection system that mitigates the high false positives due to concept drift. It connects pseudo edges to combat catastrophic forgetting, transfers suspicious states to avoid learning malicious behaviors, filters nodes at the path level to achieve precise alerts, and constructs mini-graphs to reconstruct attack scenarios. Using state-of-the-art benchmarks, we demonstrate that METANOIA improves precision at the window, graph, and node levels by 30%, 54%, and 29%, respectively, compared to previous approaches.
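
To make the concept-drift problem concrete, the following toy sketch compares a detector fitted once on offline data with one that is updated incrementally on each new window. It is not METANOIA's algorithm: the one-dimensional Gaussian "benign behavior", the 3-sigma rule, the exponential-moving-average update, and the names `simulate`, `drift`, and `alpha` are all assumptions chosen for illustration. The stream contains only benign data, so every alert is a false positive.

```python
import random


def false_positive_rate(values, mean, std, k=3.0):
    """Fraction of (benign) values flagged because |x - mean| > k * std."""
    flagged = sum(1 for x in values if abs(x - mean) > k * std)
    return flagged / len(values)


def mean_std(values):
    """Population mean and standard deviation of a list of floats."""
    m = sum(values) / len(values)
    s = (sum((x - m) ** 2 for x in values) / len(values)) ** 0.5
    return m, s


def simulate(windows=20, per_window=200, drift=0.5, alpha=0.3, seed=0):
    """Return per-window false positive rates for a static and an
    incrementally updated detector on a slowly drifting benign stream."""
    rng = random.Random(seed)

    # Offline training: both models are fitted on the first window only.
    first = [rng.gauss(0.0, 1.0) for _ in range(per_window)]
    static_mean, static_std = mean_std(first)
    inc_mean, inc_std = static_mean, static_std

    static_fp, inc_fp = [], []
    for w in range(1, windows):
        # Benign behavior drifts: its mean shifts by `drift` every window.
        data = [rng.gauss(drift * w, 1.0) for _ in range(per_window)]

        # Score the window before any update (detect, then learn).
        static_fp.append(false_positive_rate(data, static_mean, static_std))
        inc_fp.append(false_positive_rate(data, inc_mean, inc_std))

        # Incremental update (assumed here: exponential moving average).
        w_mean, w_std = mean_std(data)
        inc_mean = (1 - alpha) * inc_mean + alpha * w_mean
        inc_std = (1 - alpha) * inc_std + alpha * w_std

    return static_fp, inc_fp


if __name__ == "__main__":
    static_fp, inc_fp = simulate()
    print(f"static model, last window FPR:      {static_fp[-1]:.2f}")
    print(f"incremental model, last window FPR: {inc_fp[-1]:.2f}")
```

The sketch updates on every window unconditionally, which is exactly what challenge C2 warns against: if a window contained attack activity, the model would absorb it as normal. It also keeps no safeguard against forgetting older benign behavior (C1).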

Paper Structure

This paper contains 36 sections, 4 equations, 5 figures, 11 tables, 1 algorithm.

Figures (5)

  • Figure 1: Motivating Example.
  • Figure 2: Overview of METANOIA's architecture.
  • Figure 3: Reconstructed Attack Scenario. The red dashed box indicates the window-level alerts generated by METANOIA, with the specific time shown in the bottom right corner. True positives are on the left and false positives are on the right. Red nodes represent attack-related nodes, while white nodes represent non-attack-related nodes.
  • Figure 4: Growth Trend of RN Pool size over time.
  • Figure 5: Impact of decay factor $\beta$, event anomaly threshold $\sigma$, path-level scoring threshold $\delta$ and node suspicion threshold $\gamma$. The four metrics corresponding to each parameter are, from left to right, Precision, Recall, Accuracy, and F1-Score.
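
For reference, the four metrics named in the Figure 5 caption follow the standard confusion-matrix definitions. The sketch below computes them from raw counts; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard precision, recall, accuracy and F1 from confusion counts.

    Ratios with a zero denominator are reported as 0.0 (a convention
    assumed here, not something the paper specifies).
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in classification_metrics(tp=8, fp=2, tn=85, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Because benign data vastly outnumbers attack data in intrusion detection, accuracy can stay high even when many alerts are false positives, which is why precision gains are the figures the abstract reports.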