Revisiting Graph-Based Fraud Detection in Sight of Heterophily and Spectrum

Fan Xu, Nan Wang, Hao Wu, Xuezhi Wen, Xibin Zhao, Hai Wan

TL;DR

This work tackles graph-based fraud detection under challenging heterophily and severe label imbalance. It introduces SEC-GFD, a dual-module framework with a spectrum-enhanced hybrid-pass filter and a local environmental constraint component, to leverage high-frequency information and local contextual cues while improving label utilization. Empirical results on four real-world datasets show SEC-GFD consistently outperforms both homophily-based GNNs and specialized fraud detectors, with notable gains on Amazon and large-scale datasets. The approach provides practical benefits by better exploiting spectral information and multi-hop environments in fraud graphs, and code is released for reproducibility.

Abstract

Graph-based fraud detection (GFD) can be regarded as a challenging semi-supervised binary node classification task. In recent years, Graph Neural Networks (GNNs) have been widely applied to GFD, characterizing how likely a node is to be anomalous by aggregating neighbor information. However, fraud graphs are inherently heterophilic, so most GNNs perform poorly because they assume homophily. In addition, because of heterophily and class imbalance, existing models do not fully utilize the precious node label information. To address these issues, this paper proposes SEC-GFD, a semi-supervised GNN-based fraud detector. The detector comprises a hybrid filtering module and a local environmental constraint module, which address the heterophily and label utilization problems, respectively. The first module works in the spectral domain and alleviates the heterophily problem to a certain extent. Specifically, it divides the spectrum into various mixed-frequency bands based on the correlation between spectrum energy distribution and heterophily. Then, to make full use of the node label information, a local environmental constraint module is adaptively designed. Comprehensive experimental results on four real-world fraud detection datasets show that SEC-GFD outperforms other competitive graph-based fraud detectors. We release our code at https://github.com/Sunxkissed/SEC-GFD.
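
The correlation between spectrum energy distribution and heterophily mentioned in the abstract can be illustrated on a toy graph. The sketch below is not the SEC-GFD implementation; it is a minimal illustration under stated assumptions: a hypothetical 6-node ring graph, two hand-picked labelings, the symmetric normalized Laplacian, and an assumed cutoff of 1.0 (the midpoint of the eigenvalue range [0, 2]) to separate "low" from "high" frequencies. The function names `edge_heterophily` and `high_frequency_energy` are illustrative and do not come from the paper or its repository.

```python
import numpy as np

def edge_heterophily(edges, labels):
    """Fraction of edges whose endpoints carry different labels."""
    return float(np.mean([labels[u] != labels[v] for u, v in edges]))

def high_frequency_energy(edges, labels, cutoff=1.0):
    """Share of the centered label signal's energy on Laplacian
    eigenvalues above `cutoff` (an assumed threshold, not from the paper)."""
    n = len(labels)
    adj = np.zeros((n, n))
    for u, v in edges:
        adj[u, v] = adj[v, u] = 1.0
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)  # assumes the graph has no isolated nodes
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)
    x = np.asarray(labels, dtype=float)
    x = x - x.mean()              # center the signal before projecting
    coeffs = eigvecs.T @ x        # graph Fourier coefficients
    energy = coeffs ** 2
    return float(energy[eigvals > cutoff].sum() / energy.sum())

# Hypothetical toy data: a 6-node ring with two different labelings.
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
clustered = [0, 0, 0, 1, 1, 1]    # same-label nodes sit next to each other
alternating = [0, 1, 0, 1, 0, 1]  # every edge joins different labels

for name, labels in [("clustered", clustered), ("alternating", alternating)]:
    print(name,
          round(edge_heterophily(ring_edges, labels), 3),
          round(high_frequency_energy(ring_edges, labels), 3))
```

On this toy graph the alternating labeling has both the higher edge heterophily and the larger share of energy above the cutoff, which is the kind of relationship a band-splitting filter could exploit.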

Paper Structure

This paper contains 24 sections, 18 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Schematic diagram of fraud relationships in real-world scenarios.
  • Figure 2: Overview of the SEC-GFD model; (a) and (b) show the details of the hybrid-pass filter module and the local environmental constraint module, respectively.
  • Figure 3: Filter performance under different deletion ratios of heterophilic edges on Amazon and YelpChi. Panels (a) and (c) show deletion of heterophilic edges across the entire graph, while panels (b) and (d) show deletion of heterophilic edges appearing in the training graph.
  • Figure 4: Performance for different values of the order C.