Unsupervised Graph Anomaly Detection via Multi-Hypersphere Heterophilic Graph Learning

Hang Ni, Jindong Han, Nengjun Zhu, Hao Liu

TL;DR

This work tackles unsupervised Graph Anomaly Detection (GAD) by addressing two core problems: (i) homophily-driven indistinguishability of anomalies in GNNs, and (ii) uniform global scoring that overlooks local context. The authors introduce MHetGL, a two-stage framework that combines Heterophilic Graph Encoding (HGE), which purifies and augments neighborhoods, with Multi-Hypersphere Learning (MHL), which models both global and local normal patterns through multiple hyperspheres and uses a regularization strategy to prevent collapse. Experiments on ten real-world datasets show that MHetGL outperforms 14 baselines; the authors also report robustness on organic data and scalability to large graphs. The approach provides a practical, end-to-end unsupervised GAD solution with strong interpretability through curvature-based refinement, GDV-informed augmentation, and community-aware hypersphere modeling.

Abstract

Graph Anomaly Detection (GAD) plays a vital role in various data mining applications such as e-commerce fraud prevention and malicious user detection. Recently, Graph Neural Network (GNN) based approaches have demonstrated great effectiveness in GAD by first encoding graph data into low-dimensional representations and then identifying anomalies under the guidance of supervised or unsupervised signals. However, existing GNN-based approaches implicitly follow the homophily principle (i.e., the "like attracts like" phenomenon) and fail to learn discriminative embeddings for anomalies that connect to many normal nodes. Moreover, such approaches identify anomalies from a unified global perspective but overlook diversified abnormal patterns conditioned on local graph context, leading to suboptimal performance. To overcome these limitations, we propose a Multi-hypersphere Heterophilic Graph Learning (MHetGL) framework for unsupervised GAD. Specifically, we first devise a Heterophilic Graph Encoding (HGE) module to learn distinguishable representations for potential anomalies by purifying and augmenting their neighborhoods in a fully unsupervised manner. Then, we propose a Multi-Hypersphere Learning (MHL) module to enhance the detection capability for context-dependent anomalies by jointly incorporating critical patterns from both global and local perspectives. Extensive experiments on ten real-world datasets show that MHetGL outperforms 14 baselines. Our code is publicly available at https://github.com/KennyNH/MHetGL.
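
To make the idea of combining global and local perspectives concrete, the following is a minimal sketch of distance-based scoring with one global hypersphere center and one center per community. It is an illustration under stated assumptions, not the authors' MHL module: the node embeddings and community labels are assumed to be given, centers are taken as plain means, and the function name `multi_hypersphere_scores` and the mixing weight `alpha` are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def multi_hypersphere_scores(embeddings, communities, alpha=0.5):
    """Weighted sum of distance to the global center and to the node's community center.

    `alpha` is a hypothetical mixing weight (assumption, not from the paper).
    """
    global_center = centroid(embeddings)

    # Group embeddings by community label to get one local center per community.
    groups = {}
    for emb, com in zip(embeddings, communities):
        groups.setdefault(com, []).append(emb)
    local_centers = {com: centroid(members) for com, members in groups.items()}

    return [
        alpha * distance(emb, global_center)
        + (1 - alpha) * distance(emb, local_centers[com])
        for emb, com in zip(embeddings, communities)
    ]


# Toy data: two communities; the last node drifts away from its own community.
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 9.0]]
communities = [0, 0, 0, 1, 1, 1]

scores = multi_hypersphere_scores(embeddings, communities)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(most_anomalous, [round(s, 3) for s in scores])
```

A larger score means the node lies farther from what its global and local contexts treat as normal. In a learned setting the embeddings would come from a trained encoder rather than being fixed by hand.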

Paper Structure

This paper contains 22 sections, 2 theorems, 16 equations, 10 figures, 5 tables.

Key Result

Proposition 1

In global hypersphere learning, with the regularization of $\tilde{\mathcal{L}}^{clu}$, the constant graph encoder $\Phi_g(\mathbf{x}_i)\equiv\mathbf{c}_0,\ \forall v_i\in\mathcal{V}^{train}$ does not minimize $\mathcal{L}^{glo}$.
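
The collapse that this proposition rules out can be illustrated with a generic, unregularized one-class objective. The sketch below is an assumption-laden illustration: it uses the mean squared distance to a fixed center as a stand-in loss, since the exact forms of $\mathcal{L}^{glo}$ and $\tilde{\mathcal{L}}^{clu}$ are not given in this summary, and the helper `mean_squared_distance` is hypothetical. It shows that, without regularization, an encoder mapping every node to the center drives such a loss to zero and so carries no information for detection.

```python
def mean_squared_distance(embeddings, center):
    """Generic one-class loss: average squared Euclidean distance to the center.

    Stand-in for an unregularized hypersphere objective (assumption, not the paper's loss).
    """
    total = 0.0
    for emb in embeddings:
        total += sum((x - c) ** 2 for x, c in zip(emb, center))
    return total / len(embeddings)


center = [1.0, -2.0]

# A "constant encoder" maps every training node to the center itself.
constant_embeddings = [list(center) for _ in range(4)]

# A non-degenerate encoder produces embeddings that vary across nodes.
spread_embeddings = [[0.0, 0.0], [1.0, -1.0], [2.0, -3.0], [1.5, -2.5]]

collapsed_loss = mean_squared_distance(constant_embeddings, center)
spread_loss = mean_squared_distance(spread_embeddings, center)
print(collapsed_loss, spread_loss)
```

Because the collapsed solution trivially wins under this stand-in loss, a regularizer is needed to make the constant encoder suboptimal, which is the property the proposition states for the regularized global objective.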

Figures (10)

  • Figure 1: The overall architecture of MHetGL.
  • Figure 2: The distribution discrepancy of two proposed measurements between anomalous-normal edges and anomalous-anomalous edges in the Citeseer dataset.
  • Figure 3: The distribution of two proposed measurements in the Cora dataset.
  • Figure 4: The distribution of two types of edge weights in the Weibo dataset.
  • Figure 5: The comparison of GREET and our neighborhood purification block.
  • ...and 5 more figures

Theorems & Definitions (3)

  • Definition 1
  • Proposition 1
  • Proposition 2