Semi-supervised Graph Anomaly Detection via Robust Homophily Learning

Guoguo Ai, Hezhe Qiao, Hui Yan, Guansong Pang

TL;DR

This work tackles semi-supervised graph anomaly detection under heterogeneous normal-node homophily. It introduces Robust Homophily Learning (RHO), which combines AdaFreq adaptive spectral filters with Cross-channel and Channel-wise views, and Graph Normality Alignment (GNA) to enforce consistency between views. Empirical results on eight real-world datasets show RHO outperforms state-of-the-art methods, especially when labeled normals exhibit diverse homophily, and demonstrates robustness to training size and label contamination. The approach offers a scalable, transferable framework for reliable anomaly detection in graphs with varied normal patterns, with a noted transductive limitation and potential for inductive extension in the future.

Abstract

Semi-supervised graph anomaly detection (GAD) utilizes a small set of labeled normal nodes to identify abnormal nodes from a large set of unlabeled nodes in a graph. Current methods in this line posit that 1) normal nodes share a similar level of homophily and 2) the labeled normal nodes can well represent the homophily patterns in the normal class. However, these assumptions often do not hold, since normal nodes in a graph can exhibit diverse homophily in real-world GAD datasets. In this paper, we propose RHO, namely Robust Homophily Learning, to adaptively learn such homophily patterns. RHO consists of two novel modules, adaptive frequency response filters (AdaFreq) and graph normality alignment (GNA). AdaFreq learns a set of adaptive spectral filters that capture different frequency components of the labeled normal nodes with varying homophily in the channel-wise and cross-channel views of node attributes. GNA is introduced to enforce consistency between the channel-wise and cross-channel homophily representations to robustify the normality learned by the filters in the two views. Experiments on eight real-world GAD datasets show that RHO can effectively learn varying, often under-represented, homophily in the small normal node set and substantially outperforms state-of-the-art competing methods. Code is available at https://github.com/mala-lab/RHO.
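To make the distinction between the cross-channel and channel-wise views more concrete, the following is a minimal sketch of a polynomial spectral filter applied to node attributes. It is not the authors' implementation: the polynomial parameterization, the function names `normalized_laplacian` and `poly_filter`, the fixed coefficients, and the toy graph are all assumptions made for illustration. In an actual model the coefficients would be learnable parameters; here they are fixed so the effect of sharing one filter across channels versus using one filter per channel is easy to inspect.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    inv_sqrt[nz] = deg[nz] ** -0.5  # isolated nodes keep a zero scaling
    return np.eye(adj.shape[0]) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def poly_filter(lap, x, coeffs):
    """Apply g(L) x = sum_k coeffs[k] * L^k x to node attributes x of shape (N, C).

    coeffs of shape (K+1,)   -> one filter shared by all channels (cross-channel view).
    coeffs of shape (K+1, C) -> one filter per attribute channel (channel-wise view).
    """
    coeffs = np.asarray(coeffs, dtype=float)
    if coeffs.ndim == 1:
        coeffs = coeffs[:, None]  # broadcast the shared filter over channels
    out = np.zeros_like(x, dtype=float)
    power = np.asarray(x, dtype=float)  # holds L^k x, starting at k = 0
    for c_k in coeffs:
        out += power * c_k  # c_k broadcasts over the node dimension
        power = lap @ power
    return out

# Toy path graph with 3 nodes and 2 attribute channels (hypothetical data).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
x = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [-1.0, 3.0]])
lap = normalized_laplacian(adj)

# Cross-channel view: frequency response g(lambda) = 1 - 0.5 * lambda for every channel.
cross = poly_filter(lap, x, [1.0, -0.5])

# Channel-wise view: channel 0 uses 1 - 0.5 * lambda, channel 1 uses g(lambda) = lambda.
channel = poly_filter(lap, x, [[1.0, 0.0],
                               [-0.5, 1.0]])
```

With per-channel coefficients, one channel can keep low-frequency (high-homophily) content while another emphasizes high-frequency (low-homophily) content, which is the kind of flexibility the abstract attributes to the two views. A consistency term between `cross` and `channel` representations would play the role the abstract assigns to GNA, though its exact form is not given in this section.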

Paper Structure

This paper contains 30 sections, 2 theorems, 16 equations, 11 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

Let $\{\lambda_m\}$ and $\{\mathbf{u}_{m}\}$ be the graph frequencies and frequency components, respectively, and let $\beta_m$ be the projection coefficient of signal $\mathbf{x}$ onto the $m$-th eigenvector $\mathbf{u}_m$. Then we have $g\left(\lambda_{m}\right)=\frac{\sum_{i \in \mathcal{V}_l} u_{m}(i)}{\

Figures (11)

  • Figure 1: (a) Homophily distribution of normal nodes in Amazon [dou2020enhancing] and Elliptic [weber2019anti]. (b) and (c) AUC comparison of the GCN filter [wang2021one], the BWGNN filter [tang2022rethinking], and the AdaFreq filter on the two datasets under three cases of homophily levels. In each case, normal nodes are sampled from the (low-homophily, high-homophily) sets of normal nodes, where nodes with homophily greater than 0.9 for Amazon and 0.7 for Elliptic are considered high-homophily. The three cases draw from the two sets at ratios of (80%, 20%), (50%, 50%), and (20%, 80%), respectively.
  • Figure 2: Overview of the proposed RHO framework. The input graph consists of labeled normal nodes and unlabeled normal/abnormal nodes, where nodes $v_i,v_j \in \mathcal{V}_l$ represent normal nodes with high and low homophily, respectively. (a) AdaFreq learns adaptive filters on both cross-channel and channel-wise representations, with learnable parameters for different feature channels. (b) GNA aligns the heterogeneous normal patterns learned by the adaptive filters in the two views. (c) Normal nodes with diverse homophily are projected close to the center of a hypersphere via a widely used one-class loss, while anomalous nodes remain distant from the center.
  • Figure 3: Anomaly score distributions and t-SNE visualizations of node embeddings generated by the cross-channel view ((a) and (d)), the channel-wise view ((b) and (e)), and the full RHO model ((c) and (f)) on Amazon [dou2020enhancing].
  • Figure 4: The filter curves learned by RHO on Amazon and T-Finance, where the black dashed line represents the baseline filter response from GCN, the black solid line indicates the cross-channel response, and the colored solid lines are the channel-wise responses.
  • Figure 5: AUROC and AUPRC results of RHO w.r.t. hyperparameter $\alpha$.
  • ...and 6 more figures
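
The low/high-homophily split described in the Figure 1 caption can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: it assumes node homophily means the fraction of a node's neighbors that share its label (the definition is not given in this section), and the helper names `node_homophily` and `split_by_homophily` and the toy graph are hypothetical. Only the 0.9 threshold comes from the caption (the value quoted for Amazon).

```python
import numpy as np

def node_homophily(adj, labels):
    """Fraction of each node's neighbors that share its label (NaN if isolated)."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    same = (labels[:, None] == labels[None, :]).astype(float)
    deg = adj.sum(axis=1)
    agree = (adj * same).sum(axis=1)  # count of same-label neighbors
    hom = np.full(adj.shape[0], np.nan)
    nz = deg > 0
    hom[nz] = agree[nz] / deg[nz]
    return hom

def split_by_homophily(hom, normal_mask, threshold):
    """Return (low, high) index arrays of normal nodes relative to the threshold."""
    idx = np.flatnonzero(normal_mask & ~np.isnan(hom))
    high = idx[hom[idx] > threshold]
    low = idx[hom[idx] <= threshold]
    return low, high

# Toy graph: nodes 0-2 are normal (label 0), node 3 is an anomaly (label 1).
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
labels = np.array([0, 0, 0, 1])

hom = node_homophily(adj, labels)
low, high = split_by_homophily(hom, labels == 0, threshold=0.9)
```

Sampling labeled normal nodes from `low` and `high` at different ratios would then reproduce the kind of (80%, 20%), (50%, 50%), and (20%, 80%) cases the caption describes.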

Theorems & Definitions (4)

  • Theorem 1
  • Definition A.1
  • Theorem 2
  • Proof 1