Unsupervised Graph Anomaly Detection via Multi-Hypersphere Heterophilic Graph Learning
Hang Ni, Jindong Han, Nengjun Zhu, Hao Liu
TL;DR
This work tackles unsupervised Graph Anomaly Detection (GAD) by addressing two core problems: (i) the homophily assumption in GNNs, which makes anomalies hard to distinguish from the normal nodes they connect to, and (ii) uniform global scoring, which overlooks local graph context. The authors introduce MHetGL, a two-stage framework. Heterophilic Graph Encoding (HGE) purifies and augments node neighborhoods, and Multi-Hypersphere Learning (MHL) models both global and local normal patterns through multiple hyperspheres, with a regularization strategy to prevent collapse. Experiments on ten real-world datasets show improvements over 14 baselines, along with robustness on organic data and scalability to large graphs. The approach provides a practical, end-to-end unsupervised GAD solution, with interpretability supported by curvature-based refinement, augmentation informed by the Graphlet Degree Vector (GDV), and community-aware hypersphere modeling.
Abstract
Graph Anomaly Detection (GAD) plays a vital role in various data mining applications such as e-commerce fraud prevention and malicious user detection. Recently, Graph Neural Network (GNN) based approaches have demonstrated great effectiveness in GAD by first encoding graph data into low-dimensional representations and then identifying anomalies under the guidance of supervised or unsupervised signals. However, existing GNN-based approaches implicitly follow the homophily principle (i.e., the "like attracts like" phenomenon) and fail to learn discriminative embeddings for anomalies that connect to many normal nodes. Moreover, such approaches identify anomalies from a unified global perspective but overlook diversified abnormal patterns conditioned on local graph context, leading to suboptimal performance. To overcome these limitations, we propose a Multi-Hypersphere Heterophilic Graph Learning (MHetGL) framework for unsupervised GAD. Specifically, we first devise a Heterophilic Graph Encoding (HGE) module to learn distinguishable representations for potential anomalies by purifying and augmenting their neighborhoods in a fully unsupervised manner. Then, we propose a Multi-Hypersphere Learning (MHL) module to enhance the detection capability for context-dependent anomalies by jointly incorporating critical patterns from both global and local perspectives. Extensive experiments on ten real-world datasets show that MHetGL outperforms 14 baselines. Our code is publicly available at https://github.com/KennyNH/MHetGL.
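
To make the global-versus-local distinction concrete, the following is a minimal sketch of distance-based scoring with one global hypersphere center and one center per community. It is not the MHetGL implementation. The function name `hypersphere_scores`, the toy 2-D embeddings, the pre-assigned community labels, and the use of plain centroids as hypersphere centers are all assumptions made for illustration. How the paper learns the centers and combines the two views is not reproduced here.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def hypersphere_scores(embeddings, communities):
    """Return (global_scores, local_scores) for every node.

    global_scores[i]: distance from node i to the centroid of all nodes.
    local_scores[i]:  distance from node i to the centroid of its community.
    Centroids stand in for hypersphere centers (an illustrative assumption).
    """
    global_center = centroid(embeddings)

    # Group embeddings by community label to get one local center each.
    members = {}
    for emb, label in zip(embeddings, communities):
        members.setdefault(label, []).append(emb)
    local_centers = {label: centroid(pts) for label, pts in members.items()}

    global_scores = [math.dist(e, global_center) for e in embeddings]
    local_scores = [
        math.dist(e, local_centers[label])
        for e, label in zip(embeddings, communities)
    ]
    return global_scores, local_scores


# Toy data (hypothetical): two tight communities and one node (index 4)
# that is labeled as community 0 but lies far from its members.
embeddings = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [5.0, 5.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0],
]
communities = [0, 0, 0, 0, 0, 1, 1, 1, 1]

global_scores, local_scores = hypersphere_scores(embeddings, communities)
most_anomalous_locally = max(range(len(local_scores)), key=local_scores.__getitem__)
most_normal_globally = min(range(len(global_scores)), key=global_scores.__getitem__)
print(most_anomalous_locally, most_normal_globally)
```

In this toy setup, node 4 sits between the two communities and therefore close to the global center, so a single global hypersphere ranks it as the most normal node. Measured against its own community's center, it is the clearest outlier. This is the kind of context-dependent anomaly that motivates modeling local hyperspheres alongside the global one.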
