Rethinking Contrastive Learning in Graph Anomaly Detection: A Clean-View Perspective
Di Jin, Jingyi Cao, Xiaobao Wang, Bingdao Feng, Dongxiao He, Longbiao Wang, Jianwu Dang
TL;DR
This work studies graph anomaly detection with contrastive learning and identifies interfering edges as a key source of noise that degrades training. It introduces CVGAD, which combines multi-scale anomaly awareness (node-subgraph and node-node contrasts on both the anomalous and clean views) with a progressive purification mechanism that iteratively removes high-interference edges, using edge scores derived from node-level signals. CVGAD achieves higher ROC-AUC than seven baselines across five datasets, and ablation and parameter analyses support the design. By mitigating interference while contrasting the two views, it offers a robust framework for graph anomaly detection, with practical relevance to security and fraud detection.
Abstract
Graph anomaly detection aims to identify unusual patterns in graph-based data, with wide applications in fields such as web security and financial fraud detection. Existing methods typically rely on contrastive learning, assuming that a lower similarity between a node and its local subgraph indicates abnormality. However, these approaches overlook a crucial limitation: interfering edges invalidate this assumption, because they introduce disruptive noise that compromises the contrastive learning process. As a result, the model cannot effectively learn meaningful representations of normal patterns, which leads to suboptimal detection performance. To address this issue, we propose a Clean-View Enhanced Graph Anomaly Detection framework (CVGAD), which includes a multi-scale anomaly awareness module to identify key sources of interference in the contrastive learning process. Moreover, to mitigate the bias of a one-step edge removal process, we introduce a novel progressive purification module. This module incrementally refines the graph by iteratively identifying and removing interfering edges, thereby enhancing model performance. Extensive experiments on five benchmark datasets validate the effectiveness of our approach.
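
As a rough illustration of the progressive purification idea described above, the Python sketch below repeatedly scores nodes, derives edge scores from those node scores, and removes the highest-scoring edges a few at a time instead of in a single step. It is a toy under stated assumptions, not the CVGAD implementation: the node score (one minus the cosine similarity between a node's features and the mean of its neighbours' features) stands in for the learned node-subgraph contrast, the edge score (sum of the two endpoint scores) is hypothetical, and the names `node_scores`, `purify`, `drop_per_round`, and `min_score` are invented for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def node_scores(features, edges):
    """Assumed stand-in score: 1 - cosine(node features, mean neighbour features)."""
    neighbours = {i: [] for i in range(len(features))}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    scores = []
    for i, x in enumerate(features):
        nbrs = neighbours[i]
        if not nbrs:
            scores.append(0.0)  # isolated node: no local context to disagree with
            continue
        mean = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(len(x))]
        scores.append(1.0 - cosine(x, mean))
    return scores


def purify(features, edges, rounds=3, drop_per_round=1, min_score=0.15):
    """Drop at most `drop_per_round` high-scoring edges per round, rescoring each time."""
    edges = list(edges)
    for _ in range(rounds):
        s = node_scores(features, edges)

        def edge_score(e):
            # Hypothetical edge score: sum of the two endpoint node scores.
            return s[e[0]] + s[e[1]]

        candidates = sorted(
            (e for e in edges if edge_score(e) > min_score),
            key=edge_score,
            reverse=True,
        )
        to_drop = set(candidates[:drop_per_round])
        if not to_drop:
            break  # nothing scores above the threshold: treat the graph as clean
        edges = [e for e in edges if e not in to_drop]
    return edges


# Toy graph: two tight clusters joined by one interfering edge, (2, 3).
features = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
clean_edges = purify(features, edges)
print(clean_edges)
```

On this toy graph only the bridging edge (2, 3) scores above `min_score`, so it is removed in the first round, and the loop stops in the second round once no remaining edge exceeds the threshold.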
