Correcting False Alarms from Unseen: Adapting Graph Anomaly Detectors at Test Time

Junjun Pan, Yixin Liu, Chuan Zhou, Fei Xiong, Alan Wee-Chung Liew, Shirui Pan

TL;DR

The paper tackles the problem of normality shift in graph anomaly detection caused by unseen normals at test time. It proposes TUNE, a plug-and-play, unsupervised test-time adaptation framework that aligns semantic distributions with a graph aligner and exploits aggregation contamination via a dual-branch architecture to guide adaptation without labeled data. A key idea is to use an aggregation-free auxiliary branch and an aggregation estimator to measure and mitigate contamination, enabling robust generalization to unseen normal patterns. Experimental results across 10 datasets show that TUNE improves generalization over state-of-the-art GAD methods and graph TTA baselines, with better scalability for large graphs, highlighting practical impact for dynamic, real-world graphs where normal behavior evolves.

Abstract

Graph anomaly detection (GAD), which aims to detect outliers in graph-structured data, has received increasing research attention recently. However, existing GAD methods assume identical training and testing distributions, which is rarely valid in practice. In real-world scenarios, unseen but normal samples may emerge during deployment, leading to a normality shift that degrades the performance of GAD models trained on the original data. Through empirical analysis, we reveal that the degradation arises from (1) semantic confusion, where unseen normal samples are misinterpreted as anomalies due to their novel patterns, and (2) aggregation contamination, where the representations of seen normal nodes are distorted by unseen normals through message aggregation. While retraining or fine-tuning GAD models could address these challenges, the high cost of retraining and the difficulty of obtaining labeled data often render this approach impractical in real-world applications. To bridge the gap, we propose a lightweight, plug-and-play Test-time adaptation framework for correcting Unseen Normal pattErns (TUNE) in GAD. To address semantic confusion, a graph aligner is employed to align the shifted data with the original data at the graph attribute level. Moreover, we use the minimization of representation-level shift as a supervision signal to train the aligner, leveraging the estimated aggregation contamination as a key indicator of normality shift. Extensive experiments on 10 real-world datasets demonstrate that TUNE significantly enhances the generalizability of pre-trained GAD models to both synthetic and real unseen normal patterns.
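To make aggregation contamination concrete, the following is a minimal sketch on a toy graph and is not the authors' implementation. It assumes a single round of mean aggregation with self-loops as a stand-in for a GNN layer; the feature values, the graph, and the helper `mean_aggregate` are all hypothetical. Three seen normal nodes form a triangle, and at test time one of them gains an edge to an unseen normal node whose attributes lie far from the training distribution.

```python
import numpy as np

def mean_aggregate(adj, x):
    """One round of mean neighbor aggregation with self-loops (illustrative stand-in for a GNN layer)."""
    a = adj + np.eye(adj.shape[0])
    return (a / a.sum(axis=1, keepdims=True)) @ x

# Hypothetical node attributes: nodes 0-2 are seen normals, node 3 is an unseen normal.
x = np.array([[1.0, 0.0],
              [1.1, 0.1],
              [0.9, -0.1],
              [5.0, 5.0]])

# Training-time graph: a triangle over nodes 0-2; node 3 is not connected yet.
adj_train = np.array([[0, 1, 1, 0],
                      [1, 0, 1, 0],
                      [1, 1, 0, 0],
                      [0, 0, 0, 0]], dtype=float)

# Test-time graph: the unseen normal (node 3) attaches to seen normal node 0.
adj_test = adj_train.copy()
adj_test[0, 3] = adj_test[3, 0] = 1.0

h_train = mean_aggregate(adj_train, x)
h_test = mean_aggregate(adj_test, x)

# Per-node representation shift caused purely by the new neighbor.
shift = np.linalg.norm(h_test - h_train, axis=1)
print("node 0 representation:", h_train[0], "->", h_test[0])
print("per-node shift:", shift)
```

In this toy setting node 0 keeps its own attributes, yet its aggregated representation moves from (1.0, 0.0) to (2.0, 1.25), while nodes 1 and 2, which have no unseen neighbor, are unchanged. A detector trained on the original representations could therefore flag node 0 even though nothing about the node itself changed.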

Paper Structure

This paper contains 16 sections, 6 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: (a) Sketches of conventional GAD methods under normality shift and our solution. (b) Performance drop of BWGNN on data with normality shift.
  • Figure 2: Motivating experiments on BWGNN.
  • Figure 3: Overall framework of TUNE. TUNE addresses normality shift by leveraging a graph aligner and a dual-branch architecture. It captures aggregation contamination caused by unseen normals by measuring the discrepancy between representations from a main branch and an aggregation-free auxiliary branch. To ensure that the auxiliary branch provides contamination-free representations, an aggregation estimator is jointly trained with the aligner in an alternating manner, using high-confidence normal nodes.
  • Figure 4: Performance on Photo and Computers datasets.
  • Figure 5: t-SNE visualizations of node representations by BWGNN (before and after applying TUNE) on two datasets.
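The dual-branch idea in the Figure 3 caption can be illustrated with a small sketch. It is a simplification under stated assumptions, not TUNE itself: the main branch is plain mean aggregation, the aggregation estimator is swapped for a linear least-squares map from a node's own attributes to its aggregated representation, and the estimator is fit once on a clean training graph rather than trained alternately with an aligner on high-confidence normal nodes. The ring graph, the shift magnitude, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_aggregate(adj, x):
    """Main branch stand-in: mean neighbor aggregation with self-loops."""
    a = adj + np.eye(adj.shape[0])
    return (a / a.sum(axis=1, keepdims=True)) @ x

def add_bias(x):
    """Append a constant column so the linear estimator can learn an offset."""
    return np.hstack([x, np.ones((x.shape[0], 1))])

# Hypothetical training graph: a ring of seen normal nodes with similar attributes.
n, d = 40, 3
x_train = 1.0 + 0.1 * rng.standard_normal((n, d))
idx = np.arange(n)
adj = np.zeros((n, n))
adj[idx, (idx + 1) % n] = 1.0
adj[(idx + 1) % n, idx] = 1.0

# Auxiliary branch stand-in: predict the aggregated representation from a node's
# own attributes only, without touching its neighbors (least-squares fit).
h_main_train = mean_aggregate(adj, x_train)
w, *_ = np.linalg.lstsq(add_bias(x_train), h_main_train, rcond=None)

# Test time: nodes 0-4 become unseen normals with shifted attributes.
x_test = x_train.copy()
x_test[:5] += 4.0

h_main = mean_aggregate(adj, x_test)   # uses neighbors, so it can be contaminated
h_aux = add_bias(x_test) @ w           # aggregation-free estimate

# Discrepancy between the branches as a per-node contamination estimate.
contamination = np.linalg.norm(h_main - h_aux, axis=1)
print("seen normals next to unseen nodes:", contamination[[5, 39]])
print("seen normals far from unseen nodes (max):", contamination[10:35].max())
```

Under these assumptions, the seen normal nodes adjacent to the shifted region (nodes 5 and 39) show a much larger branch discrepancy than seen normal nodes whose neighborhoods are untouched. A discrepancy of this kind is what the caption describes as the signal for guiding adaptation without labels.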