Extreme Value Modelling of Feature Residuals for Anomaly Detection in Dynamic Graphs

Sevvandi Kandanaarachchi, Conrad Sanderson, Rob J. Hyndman

TL;DR

This work tackles anomaly detection in sequences of dynamic graphs by combining a rich feature-based representation with ARIMA-based temporal modelling to obtain residuals, followed by robust dimensionality reduction and Extreme Value Theory (EVT) to model extremes. The approach directly handles variable-sized graphs and complex temporal changes, achieving significantly better accuracy than TensorSplat and Laplacian Anomaly Detection across multiple graph models (ER, BA, WS). By modelling residuals and focusing on low-density extremes with a Generalized Pareto Distribution, the method aims to reduce false positives while preserving detection power. The proposed pipeline offers a practical, scalable framework for detecting graph-level anomalies in domains such as transport, energy, and cyber networks, with clear avenues for explainability and subgraph extensions in future work.

Abstract

Detecting anomalies in a temporal sequence of graphs can be applied in areas such as the detection of accidents in transport networks and cyber attacks in computer networks. Existing methods for detecting abnormal graphs can suffer from multiple limitations, such as high false positive rates as well as difficulties with handling variable-sized graphs and non-trivial temporal dynamics. To address this, we propose a technique where temporal dependencies are explicitly modelled via time series analysis of a large set of pertinent graph features, followed by using residuals to remove the dependencies. Extreme Value Theory is then used to robustly model and classify any remaining extremes, aiming to produce low false positive rates. Comparative evaluations on a multitude of graph instances show that the proposed approach obtains considerably better accuracy than TensorSplat and Laplacian Anomaly Detection.
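The residual-then-extremes idea described above can be illustrated with a minimal sketch for a single feature series. It is not the authors' implementation: it assumes a least-squares AR(1) fit in place of ARIMA, skips the dimensionality reduction step, and fits the Generalized Pareto Distribution by the method of moments. All function names, the 0.9 threshold quantile, and the 0.01 cut-off are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def ar1_residuals(x):
    """Fit x[t] = c + phi * x[t-1] + e[t] by least squares; return e[1], ..., e[n-1]."""
    prev, curr = x[:-1], x[1:]
    mean_prev, mean_curr = statistics.fmean(prev), statistics.fmean(curr)
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = cov / var_prev
    intercept = mean_curr - phi * mean_prev
    return [c - (intercept + phi * p) for p, c in zip(prev, curr)]


def fit_gpd_moments(excesses):
    """Method-of-moments estimates (shape xi, scale sigma) of a GPD for threshold excesses."""
    mean = statistics.fmean(excesses)
    var = statistics.variance(excesses)
    ratio = mean * mean / var  # equals 1 - 2 * xi for a GPD with finite variance
    return 0.5 * (1.0 - ratio), 0.5 * mean * (ratio + 1.0)


def gpd_tail_prob(y, xi, sigma):
    """P(Y > y) for a GPD-distributed excess y >= 0."""
    if abs(xi) < 1e-9:
        return math.exp(-y / sigma)  # exponential limit as xi -> 0
    base = 1.0 + xi * y / sigma
    return base ** (-1.0 / xi) if base > 0.0 else 0.0


def flag_anomalies(series, quantile=0.9, alpha=0.01):
    """Return time steps whose absolute residual is improbably extreme under the fitted tail."""
    scores = [abs(r) for r in ar1_residuals(series)]
    threshold = sorted(scores)[int(quantile * len(scores))]
    excesses = [s - threshold for s in scores if s > threshold]
    xi, sigma = fit_gpd_moments(excesses)
    rate = len(excesses) / len(scores)  # empirical P(score > threshold)
    flagged = []
    for i, s in enumerate(scores):
        if s > threshold and rate * gpd_tail_prob(s - threshold, xi, sigma) < alpha:
            flagged.append(i + 1)  # residual i belongs to time step i + 1
    return flagged


# Hypothetical feature series: linear drift plus noise, with a spike injected at t = 60.
random.seed(0)
series = [0.5 * t + random.gauss(0.0, 1.0) for t in range(120)]
series[60] += 20.0
flagged = flag_anomalies(series)
print(flagged)
```

In this sketch the step immediately after a spike may also be flagged, because the AR(1) prediction for that step is computed from the contaminated value. A robust fit, or a full ARIMA model from a time series library, would be a more faithful stand-in for the temporal modelling the abstract describes.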

Paper Structure

This paper contains 15 sections, 7 equations, and 7 figures.

Figures (7)

  • Figure 1: Two examples of graph sequences with anomalies, with each sequence ordered left-to-right. (a) Graphs generated using the Erdős-Rényi random graph model [frieze2015introduction]; the first two graphs have an edge probability of 0.05, while the last graph is abnormal with an edge probability of 0.20. (b) Graphs with 100 edges; the first two graphs have randomly selected edges, while the last graph is abnormal as all its edges are connected to a central vertex.
  • Figure 2: (a) Example of natural change in an arbitrary graph feature over time, with an anomaly at time = 60. The distribution of the feature values is shown on the y-axis (right of plot), with the bin corresponding to the anomaly marked in red. Without taking the temporal context into account, the anomaly is not discernible from the feature distribution. (b) Residuals from temporal modelling of the data in (a). The corresponding distribution is given on the y-axis (right of plot), where the anomaly is clearly discernible.
  • Figure 3: Example graphs generated by: (a) Erdős-Rényi model [frieze2015introduction], (b) Barabási-Albert model [albert2002statistical], (c) Watts-Strogatz model [Watts1998].
  • Figure 4: Results for experiment 1. Performance is shown in terms of boxplots that summarise the distribution of AUC values. Higher AUC values indicate higher accuracy. The results are obtained from 10 time series of 100 graphs, with each graph comprising 100 vertices with edge probability $p = 0.05$, generated according to the Erdős-Rényi model [frieze2015introduction]. In each time series, an abnormal graph is placed at $t = 50$, with four distinct edge probabilities: $0.1$, $0.15$, $0.2$, $0.25$ (marked on x-axis).
  • Figure 5: Results for experiment 2. As per Fig. 4, but the edge probability is linearly increasing from $0.05$ to $0.50$ in each time series of 100 graphs. In each time series, an abnormal graph is placed at $t = 50$, with edge probability $p_\ast + 0.2727$, where $p_\ast \in \{ 0.05, 0.10, 0.15, 0.20 \}$.
  • ...and 2 more figures
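
The synthetic setups described in the captions of Figures 4 and 5 can be sketched roughly as follows, under stated assumptions: time steps are taken to be 1-indexed (so $t = 50$ maps to list index 49), edge density is used as a single illustrative feature, and the function names and random seed are hypothetical. This is a sketch of the data generation only, not the authors' experimental code.

```python
import random


def erdos_renyi_edges(n, p, rng):
    """Sample an undirected G(n, p) graph: each vertex pair is an edge with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


def density(n, edges):
    """Fraction of the n * (n - 1) / 2 possible edges that are present."""
    return len(edges) / (n * (n - 1) / 2)


def density_series(probs, n, rng):
    """Generate one graph per edge probability and return the edge density of each."""
    return [density(n, erdos_renyi_edges(n, p, rng)) for p in probs]


rng = random.Random(1)
n, length = 100, 100
anomaly_index = 49  # t = 50, assuming 1-indexed time steps

# Experiment 1 style: constant p = 0.05, abnormal graph with p = 0.20 at t = 50.
probs_const = [0.05] * length
probs_const[anomaly_index] = 0.20
densities_const = density_series(probs_const, n, rng)

# Experiment 2 style: p rises linearly from 0.05 to 0.50; the abnormal graph
# uses p_star + 0.2727 as stated in the caption (here p_star = 0.20).
probs_trend = [0.05 + 0.45 * i / (length - 1) for i in range(length)]
probs_trend[anomaly_index] = 0.20 + 0.2727
densities_trend = density_series(probs_trend, n, rng)

print(round(densities_const[anomaly_index], 3), round(densities_trend[anomaly_index], 3))
```

In the constant-probability series the abnormal graph stands out in the raw density values, whereas in the trending series later graphs reach similar densities, which is the situation where modelling the trend and inspecting residuals becomes useful.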