DiffGAD: A Diffusion-based Unsupervised Graph Anomaly Detector

Jinghan Li, Yuan Gao, Jinda Lu, Junfeng Fang, Congcong Wen, Hui Lin, Xiang Wang

TL;DR

This work tackles unsupervised graph anomaly detection by moving beyond reconstruction-based latent spaces to a diffusion-based framework that distills discriminative content into latent representations. DiffGAD uses a dual-diffusion setup to capture general and common content, then applies a discriminative-content distillation mechanism, based on classifier-free-style guidance, to separate the normal and anomalous distributions. A content-preservation strategy retains informative structure across diffusion timesteps, keeping detection robust with limited time and memory cost. Empirical results on six large-scale real-world datasets show state-of-the-art performance and strong robustness, highlighting DiffGAD’s practical relevance for scalable graph anomaly detection.

Abstract

Graph Anomaly Detection (GAD) is crucial for identifying abnormal entities within networks, garnering significant attention across various fields. Traditional unsupervised methods, which decode encoded latent representations of unlabeled data with a reconstruction focus, often fail to capture critical discriminative content, leading to suboptimal anomaly detection. To address these challenges, we present a Diffusion-based Graph Anomaly Detector (DiffGAD). At the heart of DiffGAD is a novel latent space learning paradigm, meticulously designed to enhance its proficiency by guiding it with discriminative content. This innovative approach leverages diffusion sampling to infuse the latent space with discriminative content and introduces a content-preservation mechanism that retains valuable information across different scales, significantly improving its adeptness at identifying anomalies with limited time and space complexity. Our comprehensive evaluation of DiffGAD, conducted on six real-world and large-scale datasets with various metrics, demonstrated its exceptional performance.

Paper Structure

This paper contains 32 sections, 13 equations, 10 figures, 14 tables, 1 algorithm.

Figures (10)

  • Figure 1: Given several normal and abnormal nodes, the data space is constructed by different methods. Specifically, (a) represents the latent space constructed by current reconstruction-based methods, (b) denotes the latent space learned by our discriminative guidance, (c) is the latent space constructed by introducing the preserved general content on (b).
  • Figure 2: An overview of DiffGAD. Given a graph, we first encode it into latent space, and we then reconstruct it with both unconditioned and conditioned diffusion models to distill the discriminative content. Finally, we decode the reconstructed latent embedding for anomaly detection.
  • Figure 3: ROC-AUC performance on 3 representative datasets under different scales of the control parameter $\lambda$, where the $x$-axis represents the varied $\lambda$ and the $y$-axis shows the ROC-AUC results.
  • Figure 4: ROC-AUC performance at different timesteps $t$ on 3 representative datasets, where the $x$-axis represents the timestep $t$ and the $y$-axis shows the ROC-AUC results.
  • Figure 5: Average ROC-AUC performance over 5 datasets, where the color represents the average AUC and the central line is the median. Dgraph is omitted because many methods run into OOM and TLM restrictions on it.
  • ...and 5 more figures
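
The Figure 2 caption describes reconstructing the latent embedding with both an unconditioned and a conditioned diffusion model, and Figure 3 varies a control parameter $\lambda$. A minimal sketch of how classifier-free-style guidance usually combines two such noise predictions follows. It uses the standard classifier-free guidance formula, not necessarily the paper's exact one. Reading $\lambda$ as the guidance weight is an assumption, and the function name `guided_noise` and the toy vectors are hypothetical.

```python
from typing import List


def guided_noise(eps_cond: List[float], eps_uncond: List[float], lam: float) -> List[float]:
    """Combine conditional and unconditional noise predictions.

    Standard classifier-free guidance: (1 + lam) * eps_cond - lam * eps_uncond.
    lam = 0 returns the conditional prediction unchanged; larger lam pushes
    the result further away from the unconditional prediction.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    return [(1.0 + lam) * c - lam * u for c, u in zip(eps_cond, eps_uncond)]


# Toy noise predictions for one latent vector (illustrative values only).
eps_uncond = [0.2, -0.1, 0.4]
eps_cond = [0.5, 0.1, 0.0]

# Sweep the assumed guidance weight, loosely mirroring the lambda sweep in Figure 3.
for lam in (0.0, 1.0, 2.0):
    print(lam, guided_noise(eps_cond, eps_uncond, lam))
```

In a real pipeline the two predictions would come from the trained denoisers at each sampling step, and the guided prediction would replace the plain one in the reverse-diffusion update.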