Grad: Guided Relation Diffusion Generation for Graph Augmentation in Graph Fraud Detection

Jie Yang, Rui Zhang, Ziyang Cheng, Dawei Cheng, Guang Yang, Bo Wang

TL;DR

Grad tackles Adaptive Camouflage in Graph Fraud Detection by combining a supervised graph contrastive learning module with a DDPM-inspired guided relation diffusion generator to produce homophilic auxiliary relations. These relations amplify weak fraudulent signals during aggregation, while a diffusion-based relation augmentation and multi-relational detector robustly capture anomalies. Extensive offline and online experiments on WeChat Pay data and public benchmarks show Grad consistently surpassing state-of-the-art methods in AUC and AP, especially under camouflage-heavy scenarios. The approach offers a scalable, production-ready pipeline with practical deployment details and demonstrates strong potential for real-world online payment security.

Abstract

Graph Fraud Detection (GFD) in financial scenarios has become an urgent research topic for protecting online payment security. However, as organized crime groups become more professional, fraudsters employ increasingly sophisticated camouflage strategies. Specifically, fraudsters disguise themselves by mimicking the behavioral data collected by platforms, keeping their key characteristics highly consistent with those of benign users, a strategy we call Adaptive Camouflage. This narrows the differences in behavioral traits between fraudsters and benign users within the platform's database and makes current GFD models less effective. To address this problem, we propose Grad, a relation diffusion-based graph augmentation model. In detail, Grad leverages a supervised graph contrastive learning module to enhance the fraud-benign difference and employs a guided relation diffusion generator to generate auxiliary homophilic relations from scratch. With these relations, weak fraudulent signals are enhanced during the aggregation process and become obvious enough to be captured. Extensive experiments have been conducted on two real-world datasets provided by WeChat Pay, one of the largest online payment platforms with billions of users, and three public datasets. The results show that Grad outperforms SOTA methods in various scenarios, achieving increases of up to 11.10% and 43.95% in AUC and AP, respectively. Our code is released at https://github.com/AI4Risk/antifraud and https://github.com/Muyiiiii/WWW25-Grad.
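
The intuition that homophilic relations strengthen weak fraudulent signals during aggregation can be illustrated with a toy example. The sketch below is not the Grad model: it assumes a plain mean aggregator over one-dimensional node scores, and the function names, labels, scores, and edges are all hypothetical values chosen for illustration.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share the same label."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def mean_aggregate(node, edges, feats):
    """Average a node's own score with the scores of its neighbors (undirected)."""
    neighbors = [v for u, v in edges if u == node]
    neighbors += [u for u, v in edges if v == node]
    values = [feats[node]] + [feats[n] for n in neighbors]
    return sum(values) / len(values)


# Hypothetical toy graph: 1 = fraudulent, 0 = benign.
labels = [1, 1, 0, 0, 0]
feats = [0.6, 0.7, 0.1, 0.2, 0.1]            # made-up "fraud signal" scores
original = [(0, 2), (0, 3), (0, 4), (1, 2)]  # camouflaged: fraud linked to benign
auxiliary = [(0, 1)]                         # assumed homophilic auxiliary relation

print(edge_homophily(original, labels))   # no same-label edges in the original relation
print(edge_homophily(auxiliary, labels))  # every auxiliary edge joins same-label nodes
print(mean_aggregate(0, original, feats) < mean_aggregate(0, auxiliary, feats))
```

Under these assumptions, aggregating node 0 over the original relation pulls its score toward its benign neighbors, while aggregating over the homophilic relation keeps it close to the other fraudulent node.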

Paper Structure

This paper contains 29 sections, 19 equations, 4 figures, 3 tables, 3 algorithms.

Figures (4)

  • Figure 1: The analysis of features in benign nodes and fraudulent nodes with Adaptive Camouflage. (A, B): the visualization of the feature embeddings of original and after-Grad nodes. (C, D): the similarity distribution between benign and fraudulent nodes and their neighbors.
  • Figure 2: The framework of Grad with five main components: (A) Node Group Sampler splits the entire graph into equal-sized non-overlapping subgraphs; (B) Supervised Graph Contrastive Learning (GCL) module enhances fraud-benign difference based on valuable labels; (C) Guided Relation Diffusion Generator generates homophilic auxiliary relations from scratch; (D) Relation Augmentation module provides extra global information of the entire financial transaction networks; (E) Multi-Relation Detector fuses multiple relations and detects fraudulent signals for accurate fraud detection.
  • Figure 3: Parameter sensitivity analysis on the node group size $k$, the gradient scale $s$, and the total sample steps $T$. The left sub-figures present the results of experiments conducted on YelpChi, while the right shows the results on Amazon.
  • Figure 4: Comparison of the original and Grad-generated YelpChi and WeChat Pay-Large datasets.
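
The Node Group Sampler in the Figure 2 caption, which splits the graph into equal-sized non-overlapping subgraphs of size $k$, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes random partitioning of node ids, and the function names and the handling of a smaller final group (when the node count is not divisible by $k$) are assumptions.

```python
import random


def sample_node_groups(num_nodes, k, seed=0):
    """Shuffle node ids and cut them into disjoint groups of size k.

    Assumption: if num_nodes is not divisible by k, the last group is smaller.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    order = list(range(num_nodes))
    random.Random(seed).shuffle(order)
    return [order[i:i + k] for i in range(0, num_nodes, k)]


def induced_edges(edges, group):
    """Keep only the edges whose two endpoints both fall inside the group."""
    members = set(group)
    return [(u, v) for u, v in edges if u in members and v in members]


groups = sample_node_groups(num_nodes=10, k=4)
print([len(g) for g in groups])  # group sizes: 4, 4, 2
```

Each group can then be paired with its induced edges via `induced_edges` to form one subgraph. The best value of `k` depends on the dataset, which is what the sensitivity analysis in Figure 3 examines.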