GAD-NR: Graph Anomaly Detection via Neighborhood Reconstruction

Amit Roy; Juan Shu; Jia Li; Carl Yang; Olivier Elshocht; Jeroen Smeets; Pan Li

TL;DR

GAD-NR is a graph auto-encoder (GAE) variant for graph anomaly detection that reconstructs a node's entire one-hop neighborhood (self-features, degree, and neighbor-feature distribution) using a Gaussian-approximation decoder. By modeling neighbor representations as $\mathcal{N}(\mu_u, \Sigma_u)$ and aligning that distribution with a reconstructed one via KL divergence, GAD-NR detects contextual, structural, and joint-type anomalies and scales better than the prior NWR-GAE approach (complexity $O(d)$ vs. $O(d^3)$). Experiments on six real-world datasets show gains of up to 30% in AUC over state-of-the-art baselines and robustness across anomaly types. With fixed hyperparameters the method remains competitive across diverse graphs, and the source code is openly available.
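The Gaussian alignment described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a diagonal covariance for simplicity (the summary writes a general $\Sigma_u$), the direction of the KL divergence is chosen arbitrarily, and the helper names `neighbor_gaussian` and `kl_diag_gaussians` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def neighbor_gaussian(
    neighbor_reps: Sequence[Vector], eps: float = 1e-6
) -> Tuple[List[float], List[float]]:
    """Empirical mean and per-dimension variance of a node's neighbor vectors.

    Assumption: diagonal covariance; eps keeps every variance strictly positive.
    """
    n = len(neighbor_reps)
    dim = len(neighbor_reps[0])
    mean = [sum(h[k] for h in neighbor_reps) / n for k in range(dim)]
    var = [
        sum((h[k] - mean[k]) ** 2 for h in neighbor_reps) / n + eps
        for k in range(dim)
    ]
    return mean, var


def kl_diag_gaussians(
    mu_p: Vector, var_p: Vector, mu_q: Vector, var_q: Vector
) -> float:
    """Closed-form KL(N(mu_p, diag(var_p)) || N(mu_q, diag(var_q)))."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


# Toy example: two neighbors with 2-dimensional representations.
neighbors = [[0.0, 2.0], [2.0, 4.0]]
mu_u, var_u = neighbor_gaussian(neighbors)

# A perfect reconstruction has zero divergence; a shifted mean is penalized.
kl_same = kl_diag_gaussians(mu_u, var_u, mu_u, var_u)
kl_shifted = kl_diag_gaussians(mu_u, var_u, [m + 1.0 for m in mu_u], var_u)
```

In this diagonal form the divergence needs only one pass over the neighbors to estimate the statistics and one pass over the dimensions to compare them, with no neighbor-to-neighbor matching step.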

Abstract

Graph Anomaly Detection (GAD) is a technique used to identify abnormal nodes within graphs, finding applications in network security, fraud detection, social media spam detection, and various other domains. A common method for GAD is Graph Auto-Encoders (GAEs), which encode graph data into node representations and identify anomalies by assessing the reconstruction quality of the graphs based on these representations. However, existing GAE models are primarily optimized for direct link reconstruction, resulting in nodes connected in the graph being clustered in the latent space. As a result, they excel at detecting cluster-type structural anomalies but struggle with more complex structural anomalies that do not conform to clusters. To address this limitation, we propose a novel solution called GAD-NR, a new variant of GAE that incorporates neighborhood reconstruction for graph anomaly detection. GAD-NR aims to reconstruct the entire neighborhood of a node, encompassing the local structure, self-attributes, and neighbor attributes, based on the corresponding node representation. By comparing the neighborhood reconstruction loss between anomalous nodes and normal nodes, GAD-NR can effectively detect any anomalies. Extensive experimentation conducted on six real-world datasets validates the effectiveness of GAD-NR, showcasing significant improvements (by up to 30% in AUC) over state-of-the-art competitors. The source code for GAD-NR is openly available. Importantly, the comparative analysis reveals that the existing methods perform well only in detecting one or two types of anomalies out of the three types studied. In contrast, GAD-NR excels at detecting all three types of anomalies across the datasets, demonstrating its comprehensive anomaly detection capabilities.

Paper Structure (28 sections, 8 equations, 3 figures, 7 tables, 1 algorithm)

Introduction
Related Works
Notations and Problem Formulation
Methodology
Motivations
GAE via Neighborhood Reconstruction
The encoder
The decoder
Neighbors' representation distribution reconstruction.
The overall reconstruction loss.
Anomaly Detection
Improvements over NWR-GAE
Experiment
Datasets and Baselines
Experimental Settings
...and 13 more sections

Figures (3)

Figure 1: Contextual anomalies are feature-wise different, structural anomalies form dense subgraphs in the network and joint-type anomalies connect with many nodes with different features.
Figure 2: Model architecture of GAD-NR. The encoder (left) performs dimension reduction with an MLP followed by a message passing GNN to obtain the hidden representation of a node. The decoder (right) reconstructs the self features and the node degree via MLPs and estimates the neighbor feature distribution with an MLP-predicted re-parameterized Gaussian distribution. Reconstructions of the self features and the node degree are optimized with MSE-loss whereas the KL-divergence between the ground truth and the learned neighbors' feature distribution is used for the optimization of the distribution estimation.
Figure 3: Impact of varying the self-feature reconstruction loss weight $\lambda_x'$, the degree reconstruction loss weight $\lambda_d'$, and the neighbor reconstruction loss weight $\lambda_n'$ in Eq. \ref{eq:obj} on detecting different types of anomalies in the Cora (top) and Books (bottom) datasets.
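
As a rough illustration of how the three weights in the Figure 3 caption could trade off against each other, the sketch below combines per-node reconstruction errors into one anomaly score and ranks the nodes. The referenced objective is not reproduced in this summary, so the weighted-sum form, the function names `anomaly_score` and `rank_nodes`, and the toy error values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def anomaly_score(
    self_err: float,
    degree_err: float,
    neighbor_kl: float,
    lam_x: float = 1.0,
    lam_d: float = 1.0,
    lam_n: float = 1.0,
) -> float:
    """Weighted sum of the three reconstruction errors (assumed form)."""
    return lam_x * self_err + lam_d * degree_err + lam_n * neighbor_kl


def rank_nodes(scores: Sequence[float]) -> List[int]:
    """Node indices ordered from most to least anomalous."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical per-node errors: (self-feature, degree, neighbor-distribution).
errors: List[Tuple[float, float, float]] = [
    (0.1, 0.0, 0.2),  # node 0: reconstructed well on every term
    (0.1, 4.0, 0.2),  # node 1: degree is poorly reconstructed
    (2.0, 0.0, 0.1),  # node 2: own features are poorly reconstructed
]

# Equal weights: the degree outlier (node 1) has the highest score.
balanced = rank_nodes([anomaly_score(*e) for e in errors])

# Dropping the degree term moves the feature outlier (node 2) to the top.
no_degree = rank_nodes([anomaly_score(*e, lam_d=0.0) for e in errors])
```

Under these assumptions, changing a single weight changes which kind of deviation dominates the ranking, which is the effect the figure studies per anomaly type.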
