UniGAD: Unifying Multi-level Graph Anomaly Detection

Yiqing Lin, Jianheng Tang, Chenyi Zi, H. Vicky Zhao, Yuan Yao, Jia Li

TL;DR

This paper develops the Maximum Rayleigh Quotient Subgraph Sampler (MRQSampler), which unifies multi-level formats by transferring objects at each level into graph-level tasks on subgraphs, and proves theoretically that MRQSampler maximizes the accumulated spectral energy of subgraphs to preserve the most significant anomaly information.

Abstract

Graph Anomaly Detection (GAD) aims to identify uncommon, deviated, or suspicious objects within graph-structured data. Existing methods generally focus on a single graph object type (node, edge, graph, etc.) and often overlook the inherent connections among different object types of graph anomalies. For instance, a money laundering transaction might involve an abnormal account and the broader community it interacts with. To address this, we present UniGAD, the first unified framework for detecting anomalies at node, edge, and graph levels jointly. Specifically, we develop the Maximum Rayleigh Quotient Subgraph Sampler (MRQSampler) that unifies multi-level formats by transferring objects at each level into graph-level tasks on subgraphs. We theoretically prove that MRQSampler maximizes the accumulated spectral energy of subgraphs (i.e., the Rayleigh quotient) to preserve the most significant anomaly information. To further unify multi-level training, we introduce a novel GraphStitch Network to integrate information across different levels, adjust the amount of sharing required at each level, and harmonize conflicting training goals. Comprehensive experiments show that UniGAD outperforms both existing GAD methods specialized for a single task and graph prompt-based approaches for multiple tasks, while also providing robust zero-shot task transferability. The code is available at https://github.com/lllyyq1121/UniGAD.

Paper Structure

This paper contains 27 sections, 6 theorems, 26 equations, 5 figures, 15 tables, 1 algorithm.

Key Result

Lemma 1

The Rayleigh quotient $RQ(\boldsymbol{x},\boldsymbol{L})$, i.e., the accumulated spectral energy of the graph signal, is monotonically increasing with the anomaly degree.
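
To make the quantity in Lemma 1 concrete, the following minimal sketch computes a Rayleigh quotient for a signal on a small graph. It assumes the standard form $RQ(\boldsymbol{x},\boldsymbol{L}) = \boldsymbol{x}^\top \boldsymbol{L} \boldsymbol{x} / \boldsymbol{x}^\top \boldsymbol{x}$ with the unnormalized Laplacian $\boldsymbol{L} = \boldsymbol{D} - \boldsymbol{A}$; the paper may use a different normalization. The helper names `rayleigh_quotient` and `path_laplacian`, the 5-node path graph, and the example signals are illustrative assumptions, not taken from the UniGAD code.

```python
import numpy as np


def rayleigh_quotient(x, L):
    """Return x^T L x / x^T x for a graph signal x and Laplacian L."""
    x = np.asarray(x, dtype=float)
    return float(x @ L @ x) / float(x @ x)


def path_laplacian(n):
    """Unnormalized Laplacian L = D - A of a path graph with n nodes (assumed toy graph)."""
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    D = np.diag(A.sum(axis=1))
    return D - A


L = path_laplacian(5)

# A constant signal has no variation across edges.
smooth = np.ones(5)
# The same signal with one node deviating strongly from its neighbors.
spiky = np.array([1.0, 1.0, 5.0, 1.0, 1.0])

rq_smooth = rayleigh_quotient(smooth, L)
rq_spiky = rayleigh_quotient(spiky, L)

print(rq_smooth, rq_spiky)
```

In this toy setting the constant signal gives a quotient of 0, while the signal with one deviating node gives 32/29, which illustrates the direction of the lemma: the more a signal deviates across edges, the larger its Rayleigh quotient.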

Figures (5)

  • Figure 1: The overall framework of UniGAD.
  • Figure 2: Message passing in GNNs and rooted subtree sampling.
  • Figure 3: MRQSampler: (i) Derive the condition (given as a theorem in the paper) that the optimal subtree satisfies. (ii) Decompose the problem into simpler sub-problems by recursing through the tree depth, and solve for the optimal subtree with a dynamic programming (DP) algorithm.
  • Figure 4: GraphStitch network structure in UniGAD. Node level is highlighted.
  • Figure 5: The evaluation of time and space efficiency metrics. We highlight the percentage of total execution time spent by MRQSampler.

Theorems & Definitions (11)

  • Definition 2.1: Multi-level GAD
  • Lemma 1: Tang, 2022
  • Lemma 2: Xu, 2018
  • Theorem 1
  • Corollary 1
  • Theorem 2
  • proof
  • proof
  • proof
  • Lemma 3: Dan's Favorite Inequality
  • ...and 1 more