DEFENDCLI: Command-Line Driven Attack Provenance Examination

Peilun Wu; Nan Sun; Nour Moustafa; Youyang Qu; Ming Ding

TL;DR

DEFENDCLI addresses limitations of provenance-based EDR by bringing command-line-level analysis into attack provenance graphs, targeting interoperability, reliability, flexibility, and practicability. It introduces Attack-Clause Sketch, with a refined, command-line-focused graph structure, hybrid node scoring, and interphase attack association via Leiden communities, combined with Attack-Evidence Awareness, which features Rule-Based Boosting, a SimHash Ensemble, and InfoPath retrieval with GPT-powered reporting. The system uses Retrieval-Augmented Generation (RAG) with Llama-2 to triage and explain alerts, prioritizing critical threats with a contextual narrative. On the DARPA E3 datasets, DEFENDCLI improves precision by approximately $1.6\times$ over state-of-the-art methods; in real-time industrial testing, it achieves a $2.3\times$ precision improvement over state-of-the-art research, while maintaining real-time performance through parallelization. These results indicate that practical, scalable, high-precision attack provenance analysis can yield actionable insights for security teams.
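To make the SimHash idea mentioned above concrete, the following is a minimal sketch of fingerprinting command lines so that near-duplicate or lightly obfuscated invocations can be compared by Hamming distance. It is not the authors' implementation: the tokenizer, the use of MD5 as the per-token hash, the 64-bit width, and the function names `tokenize`, `simhash`, and `hamming_distance` are all assumptions made for illustration, and the summary does not say how the ensemble itself is built.

```python
import hashlib
import re


def tokenize(cmdline: str) -> list:
    """Lowercase the command line and split on anything that is not
    alphanumeric, '.', '_' or '-'. (Assumed tokenization, for illustration.)"""
    return [t for t in re.split(r"[^a-z0-9._-]+", cmdline.lower()) if t]


def simhash(cmdline: str, bits: int = 64) -> int:
    """Classic SimHash: every token votes +1/-1 on each bit position,
    and the fingerprint keeps the bits whose vote total is positive."""
    votes = [0] * bits
    for token in tokenize(cmdline):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, v in enumerate(votes):
        if v > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Hypothetical command lines, not taken from the paper's datasets.
    a = simhash("powershell.exe -NoProfile -EncodedCommand AAAA")
    b = simhash("POWERSHELL.EXE -noprofile -encodedcommand BBBB")
    print(hamming_distance(a, b))
```

A small Hamming distance between two fingerprints suggests the command lines share most of their tokens; any threshold for treating them as similar would have to be tuned on real data.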

Abstract

Endpoint Detection and Response (EDR) solutions use attack provenance graphs to discover unknown threats through system event correlation. However, this approach still faces unsolved problems of interoperability, reliability, flexibility, and practicability in delivering actionable results. Our research highlights the limitations of current solutions in detecting obfuscation, correlating attacks, identifying low-frequency events, and ensuring robust context awareness in relation to command-line activities. To address these challenges, we introduce DEFENDCLI, a provenance-graph-based system that, for the first time, performs detection at the command-line level. By offering finer detection granularity, it addresses a gap in modern EDR systems that has been overlooked in previous research. Our solution improves the precision of the information representation by evaluating differentiation across three levels: unusual system process calls, suspicious command-line executions, and infrequent external network connections. This multi-level approach makes EDR systems more reliable in complex and dynamic environments. Our evaluation demonstrates that DEFENDCLI improves precision by approximately 1.6x compared to state-of-the-art methods on the DARPA Engagement Series attack datasets. Extensive real-time industrial testing across various attack scenarios further validates its practical effectiveness. The results indicate that DEFENDCLI not only detects previously unknown attack instances that are missed by other modern commercial solutions, but also achieves a 2.3x improvement in precision over state-of-the-art research work.
