Balanced Edge Pruning for Graph Anomaly Detection with Noisy Labels

Zhu Wang, Junnan Dong, Shuang Zhou, Chang Yang, Shengjie Zhao, Xiao Huang

TL;DR

REinforced Graph Anomaly Detector (REGAD) is proposed, which prunes the edges of candidate nodes that potentially have mistaken labels in order to perform effective GAD with noisy labels.

Abstract

Graph anomaly detection (GAD) is widely applied in many areas, such as financial fraud detection and social spammer detection. Anomalous nodes in a graph not only affect their own communities but also create a ripple effect on neighbors throughout the graph structure. Detecting anomalous nodes in complex graphs remains a challenging task. While existing GAD methods assume all labels are correct, real-world scenarios often involve inaccurate annotations. These noisy labels can severely degrade GAD performance: because anomalies are a minority class, even a small number of mislabeled instances can disproportionately interfere with detection models. Cutting edges is one way to mitigate the negative effects of noisy labels; however, it has both positive and negative influences and also raises an issue of weak supervision. To perform effective GAD with noisy labels, we propose the REinforced Graph Anomaly Detector (REGAD), which prunes the edges of candidate nodes whose labels are potentially mistaken. Moreover, we design performance feedback based on strategically crafted confident labels to guide the cutting process, ensuring optimal results. Specifically, REGAD contains two novel components. (i) A tailored policy network, which involves two-step actions to remove negative effect propagation step by step. (ii) A policy-in-the-loop mechanism to identify suitable edge removal strategies that control the propagation of noise on the graph and estimate the updated structure to obtain reliable pseudo labels iteratively. Experiments on three real-world datasets demonstrate that REGAD outperforms all baselines under different noise ratios.
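The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `keep_score` dictionary (a stand-in for the output of a learned policy), the fixed `threshold`, and the accuracy-difference reward are all assumptions made for illustration. The sketch drops low-scoring edges that touch candidate nodes, then scores the change by how well predictions agree with the confident anomaly and normal sets (written $\mathcal{AS}$ and $\mathcal{NS}$ in the Figure 2 caption).

```python
from typing import Dict, FrozenSet, Set

# Undirected edge stored as frozenset({u, v}).
Edge = FrozenSet[int]


def prune_edges(edges: Set[Edge], candidates: Set[int],
                keep_score: Dict[Edge, float],
                threshold: float = 0.5) -> Set[Edge]:
    """Drop edges that touch a candidate node and score below `threshold`.

    `keep_score` is a hypothetical stand-in for a policy's output;
    edges without a score are kept.
    """
    kept = set()
    for edge in edges:
        touches_candidate = bool(edge & candidates)
        if touches_candidate and keep_score.get(edge, 1.0) < threshold:
            continue  # cut this edge
        kept.add(edge)
    return kept


def confident_accuracy(pred: Dict[int, int], anomaly_set: Set[int],
                       normal_set: Set[int]) -> float:
    """Fraction of confident nodes whose prediction matches its pseudo label
    (1 = anomalous, 0 = normal)."""
    total = len(anomaly_set) + len(normal_set)
    if total == 0:
        return 0.0
    hits = sum(pred.get(n) == 1 for n in anomaly_set)
    hits += sum(pred.get(n) == 0 for n in normal_set)
    return hits / total


def reward(pred_before: Dict[int, int], pred_after: Dict[int, int],
           anomaly_set: Set[int], normal_set: Set[int]) -> float:
    """Assumed reward: change in accuracy on the confident sets after pruning."""
    return (confident_accuracy(pred_after, anomaly_set, normal_set)
            - confident_accuracy(pred_before, anomaly_set, normal_set))


if __name__ == "__main__":
    edges = {frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})}
    scores = {frozenset({0, 1}): 0.2, frozenset({1, 2}): 0.9}
    print(prune_edges(edges, candidates={1}, keep_score=scores))
```

In a reinforcement-learning setting, a positive reward would encourage the policy to repeat the cut, while a negative reward would discourage it. How REGAD defines its two-step actions and its reward is not specified in this section.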

Paper Structure (23 sections, 14 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: A pilot study reveals that noisy labels degrade the performance of general GAD models. The bars with cross lines denote the results when $0.1\%$ incorrect labels are injected. AUC denotes the metric of the area under the ROC curve.
  • Figure 2: An overview of our framework REGAD with the policy-in-the-loop mechanism. The edge pruner explores a Markov decision process (MDP) to learn an edge-cutting strategy, and the policy $\pi_{\theta}$ balances the quality and quantity of pruned edges based on the performance improvement. The reward design uses the trustworthy pseudo labels of the confident sets from the base detector, i.e., $\mathcal{AS}$ and $\mathcal{NS}$, to guide the reward computation.
  • Figure 3: Possible scenarios resulting from improperly pruning the edges of targets.
  • Figure 4: Impact analysis of different noisy label ratios with the same labeled anomalous nodes, measured by AUC and AUPR.
  • Figure 5: Impact analysis of different labeled anomalous node ratios with the same noisy label percentage, measured by AUC and AUPR.
  • ...and 2 more figures