SLADE: Detecting Dynamic Anomalies in Edge Streams without Labels via Self-Supervised Learning

Jongha Lee, Sunwoo Kim, Kijung Shin

TL;DR

SLADE detects dynamic anomalies in continuous-time dynamic graphs (edge streams) without labels. It trains on two self-supervised tasks: keeping each node's representation stable over time (temporal contrast) and regenerating a node's long-term interaction pattern from its short-term one (memory generation). The architecture is memory-augmented, with a GRU-based memory updater and a TGAT-based memory generator; it updates node representations incrementally, so the work per incoming edge is constant with respect to the graph size. On four real-world datasets, SLADE outperforms nine baselines in AUC, including ones that use label supervision, and ablations indicate that both objectives contribute. The approach targets scalable, real-time anomaly detection in evolving edge streams such as social, financial, and communication networks.

Abstract

To detect anomalies in real-world graphs, such as social, email, and financial networks, various approaches have been developed. While they typically assume static input graphs, most real-world graphs grow over time, naturally represented as edge streams. In this context, we aim to achieve three goals: (a) instantly detecting anomalies as they occur, (b) adapting to dynamically changing states, and (c) handling the scarcity of dynamic anomaly labels. In this paper, we propose SLADE (Self-supervised Learning for Anomaly Detection in Edge Streams) for rapid detection of dynamic anomalies in edge streams, without relying on labels. SLADE detects the shifts of nodes into abnormal states by observing deviations in their interaction patterns over time. To this end, it trains a deep neural network to perform two self-supervised tasks: (a) minimizing drift in node representations and (b) generating long-term interaction patterns from short-term ones. Failure in these tasks for a node signals its deviation from the norm. Notably, the neural network and tasks are carefully designed so that all required operations can be performed in constant time (w.r.t. the graph size) in response to each new edge in the input stream. In dynamic anomaly detection across four real-world datasets, SLADE outperforms nine competing methods, even those leveraging label supervision.
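
To make the two self-supervised signals concrete, the following is a minimal sketch of per-edge memory updates and node scoring. It is not the authors' implementation: the blended update stands in for the GRU-based updater, the average over a bounded list of recent neighbours stands in for the TGAT-based generator, and the names `StreamScorer`, `K_RECENT`, `decay`, and the equal weighting of the two terms are all hypothetical. It only illustrates how a score can combine (1) dissimilarity between a node's previous and current memory and (2) dissimilarity between its current memory and one regenerated from recent interactions, with a fixed amount of work per edge.

```python
import math
from collections import defaultdict, deque

K_RECENT = 3  # hypothetical cap on remembered neighbours; bounds the work per edge


def cosine(u, v):
    """Cosine similarity, clamped to [-1, 1]; returns 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return max(-1.0, min(1.0, dot / (norm_u * norm_v)))


class StreamScorer:
    """Toy stand-in for a memory-based scorer (assumption, not SLADE itself)."""

    def __init__(self, features):
        self.features = features  # node -> initial vector (assumed given)
        self.prev = {}            # memory before the latest update
        self.cur = {}             # memory after the latest update
        self.recent = defaultdict(lambda: deque(maxlen=K_RECENT))

    def _memory(self, node):
        return self.cur.get(node, self.features[node])

    def observe(self, src, dst, decay=0.5):
        """Update both endpoints of a new edge; touches only these two nodes."""
        mem_src, mem_dst = self._memory(src), self._memory(dst)
        for node, old, other_mem, other in (
            (src, mem_src, mem_dst, dst),
            (dst, mem_dst, mem_src, src),
        ):
            self.prev[node] = old
            # Stand-in for the GRU update: blend old memory with the partner's.
            self.cur[node] = [decay * a + (1 - decay) * b
                              for a, b in zip(old, other_mem)]
            self.recent[node].append(other)

    def score(self, node):
        """Anomaly score in [0, 1]; higher means more unusual."""
        cur = self._memory(node)
        prev = self.prev.get(node, cur)
        neighbours = self.recent[node]
        if neighbours:
            # Stand-in for the TGAT generator: average recent partners' memories.
            gen = [sum(self._memory(n)[i] for n in neighbours) / len(neighbours)
                   for i in range(len(cur))]
        else:
            gen = cur
        drift = (1 - cosine(prev, cur)) / 2   # previous vs. current memory
        regen = (1 - cosine(cur, gen)) / 2    # current vs. regenerated memory
        return (drift + regen) / 2


scorer = StreamScorer({"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0]})
scorer.observe("a", "b")          # "a" interacts with a similar node
score_normal = scorer.score("a")
scorer.observe("a", "c")          # "a" suddenly interacts with a dissimilar node
score_shifted = scorer.score("a")
print(score_normal < score_shifted)
```

In this toy run, the score of node `a` rises after it interacts with the dissimilar node `c`, because both its memory drift and the gap to its regenerated memory increase.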

Paper Structure (26 sections, 11 equations, 4 figures, 7 tables)

Figures (4)

  • Figure 1: Overview of SLADE, whose objective is to measure the anomaly score of a query node at any time. For each newly arriving edge, SLADE updates the memory vector of each endpoint using GRU. Given a query node, SLADE masks the memory vector of the node and approximately regenerates it based on its recent interactions using TGAT. Then, it measures the anomaly score of the query node based on the similarities (1) between previous and current memory vectors (related to S1) and (2) between current and generated memory vectors (related to S2). SLADE aims to maximize these similarities for model training.
  • Figure 2: AUC (in %) when varying the test start ratio. For learning-based methods, temporal edges preceding the test start ratio in the dataset are employed for training. If validation is needed, the last 10% of the training set is used for validation. Note that SLADE performs best in most cases.
  • Figure 3: The left figure shows the linear increase of the running time of SLADE with respect to the number of edges in the Reddit dataset. The right figure shows the trade-off between detection speed and accuracy (with standard deviations) in the Wikipedia dataset provided by the competing methods. Baseline methods with AUC scores below 60% are excluded to make the performance differences between the remaining methods easier to see. SLADE exhibits constant processing time per edge (as proven in the paper's analysis section), offering the best trade-off between speed and accuracy. For a training-time comparison, refer to Online Appendix D.2.
  • Figure 4: (a) and (b) show the distribution of anomaly scores assigned by SLADE to instances of each node type in the two synthetic datasets (visualization is based on Gaussian kernel density estimation). (c) and (d) show the anomaly scores at each time period. Note that in all figures, SLADE clearly distinguishes anomalies from normal nodes. For results from several baseline methods, refer to Online Appendix D.3.
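
The chronological split described in the Figure 2 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: the function name `chronological_split` and the `(source, destination, timestamp)` edge format are hypothetical, and the sketch assumes the test start ratio is measured as a fraction of the edge count (the caption does not say whether it is measured in edges or in elapsed time).

```python
def chronological_split(edges, test_start_ratio, val_fraction=0.1):
    """Split timestamped edges into train / validation / test by time order.

    edges: iterable of (src, dst, timestamp) tuples (hypothetical format).
    test_start_ratio: fraction of edges (by count, an assumption) that
        precede the test period.
    val_fraction: share of the pre-test edges held out for validation,
        taken from the end of that portion.
    """
    ordered = sorted(edges, key=lambda e: e[2])
    n_before = int(len(ordered) * test_start_ratio)
    before, test = ordered[:n_before], ordered[n_before:]

    n_val = int(len(before) * val_fraction)
    split_at = len(before) - n_val
    train, val = before[:split_at], before[split_at:]
    return train, val, test


# Toy stream: 100 edges with timestamps 0..99, given in reverse order.
toy_edges = [(i % 7, (i * 3) % 11, float(i)) for i in reversed(range(100))]
train, val, test = chronological_split(toy_edges, test_start_ratio=0.5)
print(len(train), len(val), len(test))
```

Sorting by timestamp before slicing keeps every training edge earlier than every validation edge, and every validation edge earlier than every test edge, so no future information leaks into training.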