Trivial Graph Features and Classical Learning are Enough to Detect Random Anomalies

Matthieu Latapy; Stephany Rajeh

TL;DR

Trivial graph features and classical learning techniques are sufficient to detect randomly injected anomalies extremely well. This basic approach has very low computational cost and yields easily interpretable results.

Abstract

Detecting anomalies in link streams that represent various kinds of interactions is an important research topic with crucial applications. Because of the lack of ground truth data, proposed methods are mostly evaluated through their ability to detect randomly injected links. In contrast with most proposed methods, which rely on complex approaches that raise computational and/or interpretability issues, we show here that trivial graph features and classical learning techniques are sufficient to detect such anomalies extremely well. This basic approach has very low computational costs and leads to easily interpretable results. It also has many other desirable properties that we study through an extensive set of experiments. We conclude that detection methods should now target more complex kinds of anomalies.

Paper Structure (16 sections, 8 figures, 4 tables, 1 algorithm)

This paper contains 16 sections, 8 figures, 4 tables, and 1 algorithm.

Introduction
Related work
History graphs
Trivial graph features
Experimental setup
Datasets
Anomaly injection
Learning method
Experimental results
Comparison to state-of-the-art
Using diverse resolutions
TGF with sliding windows
TGF in practice
Interpretability
Computational costs
...and 1 more section

Figures (8)

Figure 1: Examples of $G$-type and $H$-type history graphs. Top: a link stream between 4 nodes $a$, $b$, $c$, and $d$, from time $0$ to $10$. We consider the latest link $(10,b,c)$, meaning that an interaction occurred between $b$ and $c$ at time $10$. We display two $G$-type ($G_3$ and $G_8$, bottom-left) and two $H$-type ($H_3$ and $H_8$, bottom-right) history graphs for this link. The integers on the links of these graphs indicate their number of occurrences within the considered history. For instance, $H_3$ is the graph obtained from the last $3$ interactions, which involve $c$ and $d$ twice and $a$ and $b$ once.
Figure 2: The impact of size $s$ and duration $d$ (horizontal axis) of the $H$-type (left) and $G$-type (right) history graphs on AUC scores. We consider here $5$% anomaly injection and learning with $r=0.7$.
Figure 3: AUC scores obtained with various history resolutions and combinations. For each dataset, we display the best score obtained with, from left to right: an $H$-type history graph, a $G$-type history graph, the combination of all $H$-type history graphs, the combination of all $G$-type history graphs, and the combination of all these history graphs. We consider $5$% anomaly injection and learning with $r=0.7$.
Figure 4: AUC scores for TGF with sliding windows containing $50\%$ of all links, using $H$-type history graphs of size $1000$, with $5$% anomaly injection and learning rate $r=0.7$ in each window. The inset shows results for sliding windows containing only $1\%$ of all links in the largest datasets.
Figure 5: The impact of size $s$ and duration $d$ (horizontal axis) of the $H$-type (top) and $G$-type (bottom) history graphs on AUC scores in the Digg dataset. We also show the impact of the usage of different machine learning algorithms on $H$-type history graphs. We consider here $5$% anomaly injection and learning with $r=0.7$.
...and 3 more figures
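
The history graphs described in the caption of Figure 1 can be illustrated with a minimal Python sketch. It rests on assumptions not confirmed by the text above: $H_s$ is taken to be built from the $s$ links that precede the considered link, $G_d$ from the links whose timestamps fall in the $d$ time units before it, and links are treated as undirected. The function names `h_history` and `g_history` and the toy stream are hypothetical; the stream is made up for illustration and is not the one drawn in Figure 1.

```python
from collections import Counter

def _edge(u, v):
    # Assumption: links are undirected, so (u, v) and (v, u) are the same edge.
    return tuple(sorted((u, v)))

def h_history(stream, index, s):
    """Weighted graph built from the s links preceding stream[index].

    Assumption: the considered link itself is not part of its history.
    """
    start = max(0, index - s)
    return Counter(_edge(u, v) for _, u, v in stream[start:index])

def g_history(stream, index, d):
    """Weighted graph built from the links in the d time units before stream[index].

    Assumption: the window is [t - d, t), with t the time of the considered link.
    """
    t = stream[index][0]
    return Counter(
        _edge(u, v) for ts, u, v in stream[:index] if t - d <= ts < t
    )

# Hypothetical link stream of (time, node, node) triples, sorted by time.
stream = [
    (0, "a", "b"),
    (2, "b", "c"),
    (4, "a", "c"),
    (6, "c", "d"),
    (8, "a", "b"),
    (9, "c", "d"),
    (10, "b", "c"),  # the considered link
]

last = len(stream) - 1
print("H_3:", dict(h_history(stream, last, 3)))
print("G_3:", dict(g_history(stream, last, 3)))
```

In this toy stream, `h_history(stream, last, 3)` counts the pair `("c", "d")` twice and `("a", "b")` once, matching the kind of weights described for $H_3$ in the caption. The edge weights are the occurrence counts that the caption refers to as the integers on the links.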
