Community detection and anomaly prediction in dynamic networks

Hadiseh Safdari, Caterina De Bacco

TL;DR

We present a principled approach to anomaly detection in dynamic networks that uses community structure as the model of regular behavior, combining a Markovian framework for temporal transitions with latent variables for community and anomaly detection.

Abstract

Anomaly detection is an essential task in the analysis of dynamic networks, offering early warnings of abnormal behavior. We present a principled approach to detect anomalies in dynamic networks that integrates community structure as a foundational model for regular behavior. Our model identifies anomalies as irregular edges while capturing structural changes. Our approach leverages a Markovian framework for temporal transitions and latent variables for community and anomaly detection, inferring hidden parameters to detect unusual interactions. Evaluations on synthetic and real-world datasets show strong anomaly detection across various scenarios. In a case study on professional football player transfers, we detect patterns influenced by club wealth and country, as well as unexpected transactions both within and across community boundaries. This work provides a framework for adaptable anomaly detection, highlighting the value of integrating domain knowledge with data-driven techniques for improved interpretability and robustness in complex networks.

Paper Structure (30 sections, 44 equations, 8 figures, 5 tables, 1 algorithm)

Figures (8)

  • Figure 1: Anomaly detection in synthetic networks. The AUC(Z) metric quantifies the model's ability to distinguish between regular and anomalous edges. Each synthetic network has $N = 300$ nodes, average degree $\langle k \rangle=8$, and $K = 8$ equal-size communities with unmixed group membership, generated with our generative model. Here, $\beta=0.2$, $\ell=0.2$, $\phi=0.2$. Lines show averages and standard deviations over $10$ sampled networks. Markers: DynACD ($\circ$) and ACD on the aggregated dataset ($\square$).
  • Figure 2: Anomaly Detection in Real-World Datasets. Comparison of the performance of DynACD, TADDY, and LOF in detecting anomalies across five real-world datasets. Each row corresponds to a dataset, with the left column showing the recall and the right column showing the AUC(Z) scores as the fraction of injected anomalies $\rho_{a}$ increases. DynACD (solid lines) demonstrates a significant improvement in both recall and AUC(Z), outperforming TADDY (dash-dotted lines) and LOF (dashed lines). LOF shows relatively poor performance, with AUC(Z) values close to random guessing and low recall scores, particularly at higher $\rho_{a}$ levels. Lines and shaded areas represent the mean and standard deviation over 10 sampled networks, respectively.
  • Figure 3: Transfermarkt datasets, Genoa transfer network. Visualization of player transfers to and from Genoa involving various clubs at different time steps. Notably, transfers with Juventus and Inter Milan are present at most time steps.
  • Figure 4: Transfermarkt datasets: Edge distribution by club wealth category. Each cell in the heatmap represents the proportion of edges between clubs in the given pair of wealth categories (High, Average, and Low). We split the edges into a) anomalous and b) regular, as estimated by DynACD, and show one heatmap per set. Each heatmap is normalized by the total number of edges it covers, so that its entries sum to 1. Darker colors indicate a higher proportion of edges exchanged between clubs in the given wealth categories.
  • Figure 5: Communities in the Transfermarkt dataset: Incoming (soft) community membership $v$ of clubs. The colors of the y-labels indicate the country to which the receiving clubs belong. The plot reveals an alignment between the community membership of clubs and their respective nationalities. The legend on the right shows the country of each league and the color assigned to it.
  • ...and 3 more figures
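
The captions above report AUC(Z) as the measure of how well regular and anomalous edges are separated. As a minimal sketch, assuming AUC(Z) is the standard area under the ROC curve computed on per-edge anomaly labels (this page does not spell out the definition), it can be read as the probability that a randomly chosen anomalous edge receives a higher anomaly score than a randomly chosen regular edge. The function name `auc_from_scores` and the toy labels and scores are hypothetical and only serve the illustration.

```python
def auc_from_scores(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the AUC.

    labels: 1 for an anomalous edge, 0 for a regular edge (assumed encoding).
    scores: estimated anomaly score per edge; higher means more anomalous.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one regular edge")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomalous edge ranked above regular edge
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: six edges, two of them anomalous.
z_true = [0, 0, 1, 0, 1, 0]
z_score = [0.1, 0.4, 0.8, 0.2, 0.3, 0.05]
auc = auc_from_scores(z_true, z_score)
print(auc)  # 0.875
```

A value of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is the scale on which the figure captions compare methods.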