Unified Semantic Log Parsing and Causal Graph Construction for Attack Attribution

Zhuoran Tan, Christos Anagnostopoulos, Shameem P. Parambath, Jeremy Singer

TL;DR

UTLParser uses semantic analysis to construct causal graphs by merging sub-graphs built from the individual log sources in a labeled log dataset, and it leverages threat-hunting domain knowledge such as Points of Interest.

Abstract

Multi-source logs provide a comprehensive overview of ongoing system activities, allowing for in-depth analysis to detect potential threats. A practical approach to threat detection is to explicitly extract entity triples (subject, action, object) and build provenance graphs that facilitate the analysis of system behavior. However, current log parsing methods mainly focus on retrieving parameters and events from raw logs, while approaches based on entity extraction are limited to processing a single type of log. To address these gaps, we contribute a novel unified framework, UTLParser. UTLParser uses semantic analysis to construct causal graphs by merging sub-graphs built from the individual log sources in a labeled log dataset. It leverages threat-hunting domain knowledge such as Points of Interest. We further explore log generation delays and provide interfaces for optimized temporal graph querying. Our experiments show that UTLParser overcomes drawbacks of other log parsing methods. Furthermore, UTLParser precisely extracts explicit causal threat information while remaining compatible with a wide range of downstream tasks.
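To make the triple-and-merge idea concrete, the following is a minimal sketch and not UTLParser's implementation. It assumes a hypothetical `Triple` record, hypothetical helper names (`build_subgraph`, `merge_subgraphs`, `reachable`), and made-up log entries. It also assumes that entities with the same name in different log sources refer to the same node. The sketch builds one sub-graph per log source, fuses them, and follows edges forward from a suspicious entity.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One (subject, action, object) record; the schema is an assumption."""
    subject: str
    action: str
    obj: str
    timestamp: float
    source: str  # which log the triple came from


def build_subgraph(triples):
    """Adjacency map: subject -> list of (action, object, timestamp, source)."""
    graph = defaultdict(list)
    for t in triples:
        graph[t.subject].append((t.action, t.obj, t.timestamp, t.source))
    return graph


def merge_subgraphs(*graphs):
    """Fuse sub-graphs by edge union; same-named nodes are treated as one node."""
    merged = defaultdict(list)
    for g in graphs:
        for subject, edges in g.items():
            merged[subject].extend(edges)
    for edges in merged.values():
        edges.sort(key=lambda e: e[2])  # keep each node's edges in time order
    return merged


def reachable(graph, start):
    """Every entity reachable from `start` by following edges forward."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _action, obj, _ts, _src in graph.get(node, []):
            if obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


# Made-up entries from two hypothetical log sources.
auth_log = [Triple("10.0.0.5", "login", "alice", 100.0, "auth")]
audit_log = [
    Triple("alice", "exec", "/bin/bash", 101.5, "audit"),
    Triple("/bin/bash", "read", "/etc/passwd", 102.0, "audit"),
]

merged = merge_subgraphs(build_subgraph(auth_log), build_subgraph(audit_log))
impacted = reachable(merged, "10.0.0.5")
```

Neither sub-graph alone links the remote address to the file read; the path only appears after the merge, which is the motivation for fusing per-source graphs. This sketch ignores timestamps during traversal, so a real causal analysis would additionally require that each hop happens no earlier than the previous one.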

Paper Structure

This paper contains 18 sections, 1 figure, 2 tables, 1 algorithm.

Figures (1)

  • Figure 1: Unified Log Parsing Framework (UTLParser). Steps: 1. Check log type; 2. Parse logs; 3. Extract POIs; 4. Output in a unified format; 5. Analyze causality based on semantic dependency; 6. Construct causal graphs; 7. Fuse causal sub-graphs; 8. Query the graph via optimized timestamps.
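
The final step in Figure 1, querying the graph by timestamp, can be illustrated with a small sketch. This excerpt does not describe how UTLParser optimizes timestamps, so the approach below is an assumption: edges are kept sorted by time, binary search finds the query window, and a hypothetical `delay_tolerance` parameter widens the window to absorb log generation delays. The class name `TemporalEdgeIndex` and the sample edges are also made up for illustration.

```python
import bisect


class TemporalEdgeIndex:
    """Edges kept sorted by timestamp so window queries use binary search."""

    def __init__(self, edges):
        # edges: iterable of (timestamp, subject, action, object)
        self._edges = sorted(edges, key=lambda e: e[0])
        self._times = [e[0] for e in self._edges]

    def query(self, start, end, delay_tolerance=0.0):
        """Edges with start - tolerance <= timestamp <= end + tolerance."""
        lo = bisect.bisect_left(self._times, start - delay_tolerance)
        hi = bisect.bisect_right(self._times, end + delay_tolerance)
        return self._edges[lo:hi]


# Made-up edges; timestamps are seconds on an arbitrary clock.
edges = [
    (100.0, "10.0.0.5", "login", "alice"),
    (101.5, "alice", "exec", "/bin/bash"),
    (102.0, "/bin/bash", "read", "/etc/passwd"),
    (250.0, "bob", "login", "host2"),
]
index = TemporalEdgeIndex(edges)

strict = index.query(101.0, 101.9)
tolerant = index.query(101.0, 101.9, delay_tolerance=0.5)
```

With the strict window only the `exec` edge is returned. With a 0.5 second tolerance the `read` edge at 102.0 is included as well, which shows how a delayed log entry can still be associated with the event that caused it.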