Graph Neural Networks based Log Anomaly Detection and Explanation

Zhong Li, Jiayang Shi, Matthijs van Leeuwen

TL;DR

This work tackles log anomaly detection by reframing logs as attributed, directed, and weighted graphs and solving graph-level anomaly detection with an end-to-end graph neural network. It introduces OCDiGCN, a one-class digraph inception convolutional network that learns graph representations and detects anomalies via a Deep SVDD objective, while also decomposing scores to provide node-level explanations. Logs2Graphs combines log parsing, grouping, graph construction, and OCDiGCN into a single pipeline, enabling both accurate detection and interpretable root-cause cues. Empirical results on five benchmarks show Logs2Graphs achieving strong, often state-of-the-art performance, with directed graphs yielding advantages over undirected representations and semantic node attributes enhancing accuracy; the method also offers practical anomaly explanations through node contributions and subgraph visualizations.
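The scoring idea in the summary above can be illustrated with a small sketch. It assumes a Deep SVDD-style score, the squared distance between a graph embedding and a fixed center, and it assumes the graph embedding is the mean of the node embeddings. Under those assumptions the score splits exactly into one additive term per node. The function name, the mean readout, and this particular decomposition are illustrative choices and may differ from the decomposition the paper uses.

```python
def svdd_score_with_node_contributions(node_embeddings, center):
    """Score one graph and split the score into per-node terms.

    Assumptions (not taken from the paper): mean readout over nodes, and
    contribution_v = <z_v - c, z_G - c> / n, which sums to ||z_G - c||^2.
    """
    n = len(node_embeddings)
    dim = len(center)
    # Graph embedding z_G: mean of the node embeddings (assumed readout).
    graph_embedding = [sum(z[d] for z in node_embeddings) / n for d in range(dim)]
    # Offset of the graph embedding from the hypersphere center c.
    residual = [graph_embedding[d] - center[d] for d in range(dim)]
    # Deep SVDD-style anomaly score: squared distance to the center.
    score = sum(r * r for r in residual)
    # Per-node share of the score; the shares add up to the score.
    contributions = [
        sum((z[d] - center[d]) * residual[d] for d in range(dim)) / n
        for z in node_embeddings
    ]
    return score, contributions


# Toy graph: two nodes sit on the center, one node is far away.
node_embeddings = [[0.0, 0.0], [0.0, 0.0], [3.0, 0.0]]
center = [0.0, 0.0]
score, contributions = svdd_score_with_node_contributions(node_embeddings, center)
```

In this toy input the third node carries the whole score, which is the kind of cue a node-level explanation is meant to surface.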

Abstract

Event logs are widely used to record the status of high-tech systems, making log anomaly detection important for monitoring those systems. Most existing log anomaly detection methods take a log event count matrix or log event sequences as input, exploiting quantitative and/or sequential relationships between log events to detect anomalies. Unfortunately, only considering quantitative or sequential relationships may result in low detection accuracy. To alleviate this problem, we propose a graph-based method for unsupervised log anomaly detection, dubbed Logs2Graphs, which first converts event logs into attributed, directed, and weighted graphs, and then leverages graph neural networks to perform graph-level anomaly detection. Specifically, we introduce One-Class Digraph Inception Convolutional Networks, abbreviated as OCDiGCN, a novel graph neural network model for detecting graph-level anomalies in a collection of attributed, directed, and weighted graphs. By coupling the graph representation and anomaly detection steps, OCDiGCN can learn a representation that is especially suited for anomaly detection, resulting in high detection accuracy. Importantly, for each identified anomaly, we additionally provide a small subset of nodes that play a crucial role in OCDiGCN's prediction as explanations, which can offer valuable cues for subsequent root cause diagnosis. Experiments on five benchmark datasets show that Logs2Graphs performs at least on par with state-of-the-art log anomaly detection methods on simple datasets while largely outperforming them on complicated datasets.
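As a rough illustration of the log-to-graph conversion described above, the sketch below turns one group of parsed log events into a directed, weighted graph. It assumes that each distinct event is a node, that an edge points from an event to the event that immediately follows it, and that the edge weight counts how often that transition occurs. The function name and the event IDs are hypothetical, and node attributes (for example, semantic embeddings of the event templates) are left out.

```python
from collections import Counter


def build_event_graph(event_ids):
    """Build a directed, weighted graph from one ordered group of log events.

    Assumptions for illustration: nodes are distinct event IDs, and the
    weight of edge (a, b) is the number of times b directly follows a.
    """
    # One node per distinct event in the group.
    nodes = sorted(set(event_ids))
    # Count consecutive (source, target) pairs; direction follows log order.
    edge_weights = Counter(zip(event_ids, event_ids[1:]))
    return nodes, dict(edge_weights)


# Hypothetical group of parsed events, e.g. all events of one session.
nodes, edges = build_event_graph(["E1", "E2", "E3", "E2", "E3", "E4"])
```

Keeping direction separates the transition E2 to E3 from E3 to E2, a distinction that an undirected graph would merge into a single edge.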

Paper Structure (27 sections, 8 equations, 8 figures, 4 tables, 1 algorithm)

Figures (8)

  • Figure 1: The Logs2Graphs pipeline. We use attributed, directed, and weighted graphs for representing the log files with high expressiveness, and integrate representation learning and anomaly detection for accurate anomaly detection. We use off-the-shelf methods for log parsing, log grouping, and graph construction.
  • Figure 2: The construction of an attributed, directed, and edge-weighted graph from a group of log messages.
  • Figure 3: The comparative performance analysis of Logs2Graphs, measured by ROC AUC, demonstrating the distinction between utilizing node semantic attributes and node labels.
  • Figure 4: ROC AUC results of Logs2Graphs w.r.t. a wide range of contamination levels. Results are averaged over 10 runs. Note that HDFS contains only 3% anomalies, so results at the 5% and 10% contamination levels are not available.
  • Figure 5: Synthetic generation of normal (10000) and structurally anomalous (200 each) graphs.
  • ...and 3 more figures