Partitioning Message Passing for Graph Fraud Detection

Wei Zhuo, Zemin Liu, Bryan Hooi, Bingsheng He, Guang Tan, Rizal Fathony, Jia Chen

TL;DR

Partitioning Message Passing (PMP) tackles Graph Fraud Detection (GFD) by distinguishing neighbors by class during message passing instead of pruning edges. It introduces node-specific, class-aware aggregations with root-specific weight generators and an adaptive blend for unlabeled neighbors, effectively producing node-specific spectral filters. Theoretical analysis shows PMP implements an adaptive spectral convolution per node, enabling robust handling of graphs with mixed homophily/heterophily and severe label imbalance. Empirical results across Yelp, Amazon, T-Finance, T-Social, and Grab demonstrate state-of-the-art performance in both supervised and semi-supervised settings, with consistent gains over generic GNNs and specialized GFD methods, underscoring its practicality and scalability.

Abstract

Label imbalance and the mixture of homophily and heterophily are the fundamental problems encountered when applying Graph Neural Networks (GNNs) to Graph Fraud Detection (GFD) tasks. Existing GNN-based GFD models augment the graph structure to accommodate the inductive bias of GNNs towards homophily by excluding heterophilic neighbors during message passing. In our work, we argue that the key to applying GNNs for GFD is not to exclude but to distinguish neighbors with different labels. Grounded in this perspective, we introduce Partitioning Message Passing (PMP), an intuitive yet effective message passing paradigm expressly crafted for GFD. Specifically, in the neighbor aggregation stage of PMP, neighbors of different classes are aggregated with distinct node-specific aggregation functions. In this way, the center node can adaptively adjust the information aggregated from its heterophilic and homophilic neighbors, which prevents the model gradient from being dominated by benign nodes, which make up the majority of the population. We theoretically establish a connection between the spatial formulation of PMP and spectral analysis, showing that PMP acts as an adaptive node-specific spectral graph filter, which demonstrates its capability to handle graphs with mixed heterophily and homophily. Extensive experimental results show that PMP can significantly boost performance on GFD tasks.
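
To make the partitioning idea concrete, the following is a minimal NumPy sketch of one aggregation step in which fraud, benign, and unlabeled neighbors are summed separately and transformed with different weights. It is an illustration under stated assumptions, not the authors' implementation: the function name `pmp_layer`, the sum aggregator, the shared (rather than root-generated) matrices `W_fr` and `W_be`, and the sigmoid gate with vector `a` used to blend the two weights for unlabeled neighbors are all hypothetical choices made for this example.

```python
import numpy as np

FRAUD, BENIGN, UNLABELED = 1, 0, -1  # label encoding assumed for this sketch


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def pmp_layer(H, adj, labels, W_self, W_fr, W_be, a):
    """One partitioned aggregation step (illustrative only).

    H:      (n, d) node features
    adj:    (n, n) 0/1 adjacency matrix, adj[i, j] = 1 if j is a neighbor of i
    labels: (n,) entries in {FRAUD, BENIGN, UNLABELED}
    W_*:    (d, k) weight matrices
    a:      (d,) gate vector (hypothetical parameterization)
    """
    fr_mask = (labels == FRAUD).astype(float)
    be_mask = (labels == BENIGN).astype(float)
    un_mask = (labels == UNLABELED).astype(float)

    # Sum neighbor features separately for each partition.
    # adj * mask zeroes out the columns (neighbors) outside the partition.
    S_fr = (adj * fr_mask) @ H
    S_be = (adj * be_mask) @ H
    S_un = (adj * un_mask) @ H

    # Per-node blend coefficient in (0, 1), computed from the center node.
    alpha = sigmoid(H @ a)[:, None]

    # Labeled neighbors use their class-specific weights; unlabeled
    # neighbors use a center-node-specific mix of the two.
    out = H @ W_self + S_fr @ W_fr + S_be @ W_be
    out = out + alpha * (S_un @ W_fr) + (1.0 - alpha) * (S_un @ W_be)
    return out
```

A useful sanity check on this sketch: when `W_fr` and `W_be` are the same matrix, the partitions collapse and the layer reduces to ordinary sum-aggregation message passing, `H @ W_self + adj @ H @ W`. The class-aware behavior comes entirely from letting the two matrices differ.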

Paper Structure

This paper contains 32 sections, 1 theorem, 12 equations, 7 figures, 9 tables, 1 algorithm.

Key Result

Theorem 1

Consider an undirected graph $\mathcal{G}$, and let $\mathbf{L} = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^\top$ be the eigendecomposition of the symmetric normalized Laplacian $\mathbf{L} = \mathbf{I}-\mathbf{D}^{-1/2} \mathbf{A} \mathbf{D}^{-1/2}$, where $\mathbf{U}$ is the matrix of eigenvectors … where the spectral convolution filters are diagonal matrices defined as: … where $\mathcal{N}_{\text{…}}$ …
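
As background for the quantities named in Theorem 1, the sketch below builds the symmetric normalized Laplacian $\mathbf{L} = \mathbf{I}-\mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$, computes its eigendecomposition, and applies a generic diagonal spectral filter $\mathbf{U}\,g(\mathbf{\Lambda})\,\mathbf{U}^\top \mathbf{x}$. It does not reproduce the node-specific filters of the theorem, whose definitions are not given in this excerpt; the function names and the filter argument `g` are assumptions made for illustration.

```python
import numpy as np


def normalized_laplacian(adj):
    """Return L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    # Isolated nodes have degree 0; their D^{-1/2} entry is set to 0 here.
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]


def spectral_filter(adj, x, g):
    """Apply the diagonal spectral filter g(Lambda): returns U g(Lambda) U^T x."""
    L = normalized_laplacian(adj)
    lam, U = np.linalg.eigh(L)  # L is symmetric, so eigh applies
    return U @ (g(lam) * (U.T @ x))
```

For example, the low-pass choice `g = lambda lam: 1 - lam` gives $\mathbf{U}(\mathbf{I}-\mathbf{\Lambda})\mathbf{U}^\top = \mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$, the propagation matrix of a GCN-style layer without self-loops. A filter that instead emphasizes large eigenvalues amplifies differences between neighbors, which is the behavior heterophilic edges call for.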

Figures (7)

  • Figure 1: Comparison between a generic message passing GNN and a partitioning message passing GNN. Red, blue and grey denote fraud, benign and unlabeled nodes, respectively.
  • Figure 2: (a) A Barabási–Albert graph with 500 nodes, of which 10% are fraud. Features of benign nodes (depicted in blue) follow a Gaussian distribution, $\mathscr{N}(1,1)$, while those of fraud nodes (shown in red) are drawn from $\mathscr{N}(5,1)$. (b) Spectral convolution filters of PMP for a random node $v_i$.
  • Figure 3: AUC vs. testing time on T-Social.
  • Figure 4: Influence distribution.
  • Figure 5: Label distributions of the labeled neighborhoods of the training nodes.
  • ...and 2 more figures
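
The synthetic setup described in the caption of Figure 2(a) can be approximated with the sketch below. Only the graph family, the node count, the fraud ratio, and the two feature distributions come from the caption. Everything else is an assumption for illustration: the attachment parameter `m`, the simplified preferential-attachment routine, scalar features, and fraud nodes chosen uniformly at random.

```python
import numpy as np


def barabasi_albert_edges(n, m, rng):
    """Simplified preferential attachment: each new node links to m distinct
    existing nodes, sampled with probability proportional to their degree."""
    edges = []
    targets = list(range(m))  # the first new node connects to nodes 0..m-1
    repeated = []             # each node id appears once per incident edge
    for new in range(m, n):
        edges.extend((new, t) for t in targets)
        repeated.extend(targets)
        repeated.extend([new] * m)
        # Drawing uniformly from `repeated` is degree-proportional sampling.
        chosen = set()
        while len(chosen) < m:
            chosen.add(repeated[rng.integers(len(repeated))])
        targets = sorted(chosen)
    return edges


def make_toy_graph(n=500, fraud_ratio=0.10, m=2, seed=0):
    """Return (edges, labels, x) for a toy graph in the style of Figure 2(a)."""
    rng = np.random.default_rng(seed)
    edges = barabasi_albert_edges(n, m, rng)
    labels = np.zeros(n, dtype=int)  # 0 = benign, 1 = fraud
    fraud_idx = rng.choice(n, size=int(round(fraud_ratio * n)), replace=False)
    labels[fraud_idx] = 1
    # Benign features ~ N(1, 1), fraud features ~ N(5, 1).
    benign_x = rng.normal(1.0, 1.0, size=n)
    fraud_x = rng.normal(5.0, 1.0, size=n)
    x = np.where(labels == 1, fraud_x, benign_x)
    return edges, labels, x
```

Because fraud labels are assigned independently of the graph structure in this sketch, most neighbors of a fraud node are benign, so the toy graph mixes homophilic and heterophilic edges in the way the paper's setting assumes.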

Theorems & Definitions (2)

  • Theorem 1
  • proof