SmoothGNN: Smoothing-aware GNN for Unsupervised Node Anomaly Detection

Xiangyu Dong, Xingyi Zhang, Yanni Sun, Lei Chen, Mingxuan Yuan, Sibo Wang

TL;DR

This work proposes SmoothGNN, a novel unsupervised node anomaly detection (NAD) framework. It uses a learning component that explicitly captures Individual Smoothing Patterns (ISP) to detect node anomalies, and a coefficient built on the finding that Neighborhood Smoothing Patterns (NSP) can serve as coefficients for node representations, which aids in identifying anomalous nodes.

Abstract

The smoothing issue in graph learning leads to indistinguishable node representations, posing significant challenges for graph-related tasks. However, our experiments reveal that this problem can uncover underlying properties of node anomaly detection (NAD) that previous research has missed. We introduce Individual Smoothing Patterns (ISP) and Neighborhood Smoothing Patterns (NSP), which indicate that the representations of anomalous nodes are harder to smooth than those of normal ones. In addition, we explore the theoretical implications of these patterns, demonstrating the potential benefits of ISP and NSP for NAD tasks. Motivated by these findings, we propose SmoothGNN, a novel unsupervised NAD framework. First, we design a learning component to explicitly capture ISP for detecting node anomalies. Second, we design a spectral graph neural network to implicitly learn ISP to enhance detection. Third, we design an effective coefficient based on our findings that NSP can serve as coefficients for node representations, aiding in the identification of anomalous nodes. Furthermore, we devise a novel anomaly measure to calculate loss functions and anomaly scores for nodes, reflecting the properties of NAD using ISP and NSP. Extensive experiments on 9 real datasets show that SmoothGNN outperforms the best rival by an average of 14.66% in AUC and 7.28% in Average Precision, with a 75x running-time speedup, validating the effectiveness and efficiency of our framework.

Paper Structure (24 sections, 6 theorems, 28 equations, 10 figures, 8 tables, 2 algorithms)

This paper contains 24 sections, 6 theorems, 28 equations, 10 figures, 8 tables, 2 algorithms.

Key Result

Theorem 1

Let $\boldsymbol{P} = \frac{\boldsymbol{I}_n+\tilde{\boldsymbol{A}}}{2}$ denote the propagation matrix given the adjacency matrix $\tilde{\boldsymbol{A}}$. For an augmented propagation matrix $\boldsymbol{B}^t=(\boldsymbol{P}-\boldsymbol{P}^\infty)^t$, where $\boldsymbol{P}^\infty$ represents the co [...] where $\mathbb{I}[\cdot]$ is the indicator function, $a_{i,j}$ is the $(i,j)$-th entry of the adjac [...]
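
The two matrices named in Theorem 1 can be explored numerically. The following is a minimal NumPy sketch, not the authors' implementation. It assumes that $\tilde{\boldsymbol{A}}$ is the symmetrically normalized adjacency $\boldsymbol{D}^{-1/2}\boldsymbol{A}\boldsymbol{D}^{-1/2}$ of a connected undirected graph and that $\boldsymbol{P}^\infty$ is the limit of $\boldsymbol{P}^t$; the excerpt above is truncated and confirms neither. Under those assumptions $\boldsymbol{P}^\infty = \boldsymbol{u}\boldsymbol{u}^\top$ with $u_i = \sqrt{d_i / \sum_j d_j}$. The function names (`propagation_matrix`, `limit_matrix`, `residual_norms`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def propagation_matrix(adj):
    """P = (I + A_norm) / 2, with A_norm = D^{-1/2} A D^{-1/2} (assumed normalization)."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)  # requires every node to have degree > 0
    a_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    return 0.5 * (np.eye(adj.shape[0]) + a_norm)


def limit_matrix(adj):
    """Limit of P^t for a connected undirected graph (assumption): u u^T."""
    deg = adj.sum(axis=1)
    u = np.sqrt(deg / deg.sum())  # unit-norm eigenvector of P for eigenvalue 1
    return np.outer(u, u)


def residual_norms(adj, x, max_hops):
    """Row-wise L2 norms of B^t X for t = 1..max_hops, where B = P - P^inf.

    Returns an array of shape (max_hops, n): entry [t-1, i] measures how far
    node i's propagated features still are from the fully smoothed state.
    """
    b = propagation_matrix(adj) - limit_matrix(adj)
    h = np.asarray(x, dtype=float)
    norms = []
    for _ in range(max_hops):
        h = b @ h  # one more hop: h = B^t X
        norms.append(np.linalg.norm(h, axis=1))
    return np.stack(norms)


if __name__ == "__main__":
    # Toy graph: a triangle (0, 1, 2) with a two-node tail (3, 4).
    adj = np.array(
        [
            [0, 1, 1, 0, 0],
            [1, 0, 1, 0, 0],
            [1, 1, 0, 1, 0],
            [0, 0, 1, 0, 1],
            [0, 0, 0, 1, 0],
        ],
        dtype=float,
    )
    x = np.random.default_rng(0).normal(size=(5, 3))  # random node features
    print(residual_norms(adj, x, max_hops=4))
```

Because $\boldsymbol{P}\boldsymbol{P}^\infty = \boldsymbol{P}^\infty\boldsymbol{P} = \boldsymbol{P}^\infty$ and $\boldsymbol{P}^\infty$ is idempotent under these assumptions, $(\boldsymbol{P}-\boldsymbol{P}^\infty)^t = \boldsymbol{P}^t - \boldsymbol{P}^\infty$, so each row norm measures the gap between a node's $t$-hop representation and its converged value. Whether this matches the quantity the paper uses to define its smoothing patterns cannot be determined from the excerpt.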

Figures (10)

  • Figure 1: Smoothing Patterns of Amazon.
  • Figure 2: Smoothing Patterns of T-Finance.
  • Figure 3: Varying the standard deviation, learning rate, hop, and hidden dimension.
  • Figure 4: Smoothing Patterns of Reddit.
  • Figure 5: Smoothing Patterns of Tolokers.
  • ...and 5 more figures

Theorems & Definitions (8)

  • Theorem 1
  • Theorem 2
  • Definition 1: tang22bwgnn, dong24rqgnn
  • Theorem 3
  • Definition 2: rong20dropedge
  • Theorem 4
  • Lemma 1
  • Corollary 1