GAD-NR: Graph Anomaly Detection via Neighborhood Reconstruction
Amit Roy, Juan Shu, Jia Li, Carl Yang, Olivier Elshocht, Jeroen Smeets, Pan Li
TL;DR
GAD-NR (Graph Anomaly Detection via Neighborhood Reconstruction) is a graph auto-encoder (GAE) variant that reconstructs a node's entire one-hop neighborhood (self-features, degree, and neighbor-feature distribution) using a Gaussian-approximation decoder. It models the neighbor representations of a node $u$ as $\mathcal{N}(\mu_u, \Sigma_u)$ and matches this distribution to a reconstructed one via KL divergence. This lets GAD-NR detect contextual, structural, and joint-type anomalies, with better scalability than the prior NWR-GAE approach (complexity $O(d)$ vs $O(d^3)$). Experiments on six real-world datasets show gains of up to 30% in AUC over state-of-the-art baselines, and the method remains robust across all three anomaly types. The approach is practical: a fixed hyperparameter setting performs competitively across diverse graphs, and the source code is openly available.
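The Gaussian approximation and KL matching mentioned above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the helper names `neighbor_gaussian` and `gaussian_kl`, the regularization constant `eps`, the toy data, and the direction of the KL divergence are all assumptions made for illustration.

```python
import numpy as np

def neighbor_gaussian(neighbor_reps, eps=1e-3):
    """Empirical mean and regularized covariance of a node's neighbor representations.

    `eps` is an assumed ridge term that keeps the covariance invertible.
    """
    mu = neighbor_reps.mean(axis=0)
    centered = neighbor_reps - mu
    cov = centered.T @ centered / neighbor_reps.shape[0]
    cov = cov + eps * np.eye(neighbor_reps.shape[1])
    return mu, cov

def gaussian_kl(mu_p, cov_p, mu_q, cov_q):
    """Closed-form KL( N(mu_p, cov_p) || N(mu_q, cov_q) ) for full covariances."""
    k = mu_p.shape[0]
    diff = mu_q - mu_p
    # Solve linear systems rather than forming the inverse of cov_q explicitly.
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p))
    maha_term = diff @ np.linalg.solve(cov_q, diff)
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (trace_term + maha_term - k + logdet_q - logdet_p)

# Toy data: 8 neighbors with 4-dimensional representations (hypothetical values).
rng = np.random.default_rng(0)
neighbor_reps = rng.normal(size=(8, 4))
mu_u, sigma_u = neighbor_gaussian(neighbor_reps)

# A perfect reconstruction gives (numerically) zero divergence;
# a reconstruction with a shifted mean gives a positive one.
kl_same = gaussian_kl(mu_u, sigma_u, mu_u, sigma_u)
kl_shifted = gaussian_kl(mu_u, sigma_u, mu_u + 1.0, sigma_u)
```

In a trained model the second Gaussian would come from the decoder rather than from a hand-shifted copy. Because the divergence has a closed form, no matching between individual neighbors is needed to evaluate it.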
Abstract
Graph Anomaly Detection (GAD) identifies abnormal nodes within graphs, with applications in network security, fraud detection, social media spam detection, and other domains. A common approach to GAD uses Graph Auto-Encoders (GAEs), which encode graph data into node representations and identify anomalies by assessing how well the graph can be reconstructed from these representations. However, existing GAE models are primarily optimized for direct link reconstruction, so nodes connected in the graph end up clustered in the latent space. As a result, they excel at detecting cluster-type structural anomalies but struggle with more complex structural anomalies that do not conform to clusters. To address this limitation, we propose GAD-NR, a new variant of GAE that incorporates neighborhood reconstruction for graph anomaly detection. GAD-NR aims to reconstruct the entire neighborhood of a node, encompassing the local structure, self-attributes, and neighbor attributes, based on the corresponding node representation. By comparing the neighborhood reconstruction loss of anomalous nodes with that of normal nodes, GAD-NR can effectively detect anomalies. Extensive experiments on six real-world datasets validate the effectiveness of GAD-NR, showing significant improvements (by up to 30% in AUC) over state-of-the-art competitors. Importantly, the comparative analysis reveals that existing methods perform well on only one or two of the three types of anomalies studied. In contrast, GAD-NR excels at detecting all three types of anomalies across the datasets. The source code for GAD-NR is openly available.
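The idea of flagging anomalies by neighborhood reconstruction loss can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the function `anomaly_scores`, the weights `w_feat`, `w_deg`, and `w_neigh`, and the per-node error values are hypothetical, and in practice the three error terms would be produced by a trained decoder.

```python
import numpy as np

def anomaly_scores(feat_err, degree_err, neigh_err,
                   w_feat=1.0, w_deg=1.0, w_neigh=1.0):
    """Weighted sum of per-node reconstruction errors; higher means more anomalous.

    The three terms stand for self-feature, degree, and neighbor-distribution
    reconstruction errors. The weights are assumed hyperparameters.
    """
    feat_err = np.asarray(feat_err, dtype=float)
    degree_err = np.asarray(degree_err, dtype=float)
    neigh_err = np.asarray(neigh_err, dtype=float)
    return w_feat * feat_err + w_deg * degree_err + w_neigh * neigh_err

# Hypothetical per-node errors for a 5-node graph.
feat_err = [0.10, 0.20, 0.10, 2.50, 0.15]    # node 3 has unusual attributes
degree_err = [0.05, 0.10, 1.80, 0.10, 0.05]  # node 2 has unusual structure
neigh_err = [0.20, 0.10, 0.30, 0.20, 0.10]

scores = anomaly_scores(feat_err, degree_err, neigh_err)

# Rank nodes from most to least anomalous.
ranking = np.argsort(-scores)
```

Here nodes 3 and 2 rank highest because one of their error terms is large. Changing the weights shifts the score toward contextual (attribute) or structural anomalies.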
