Dupin: A Parallel Framework for Densest Subgraph Discovery in Fraud Detection on Massive Graphs (Technical Report)

Jiaxin Jiang; Siyuan Yao; Yuchen Li; Qiange Wang; Bingsheng He; Min Chen

TL;DR

Fraud detection on billion-scale graphs requires scalable Densest Subgraph Discovery (DSD). Dupin provides a generic parallel peeling framework with global and local pruning that accelerates DSD across multiple density metrics (DG, DW, FD, TDS, kCLiDS) while preserving approximation guarantees. Theoretical results show a k(1+ε)-approximation with a logarithmic bound on the number of peeling rounds. Empirical results on real-world, large-scale graphs show detection up to 100x faster than existing methods and an increase in the share of fraudulent transactions prevented from 45% to as much as 94.5%. Dupin's architecture, APIs, and long-tail pruning support flexible, real-time fraud analytics, making it a practical tool for production fraud-detection pipelines.

Abstract

Detecting fraudulent activities in financial and e-commerce transaction networks is crucial. One effective method for this is Densest Subgraph Discovery (DSD). However, deploying DSD methods in production systems faces substantial scalability challenges: existing methods are predominantly sequential, which limits their ability to handle large-scale transaction networks and results in significant detection delays. To address these challenges, we introduce Dupin, a novel parallel processing framework designed for efficient DSD processing on billion-scale graphs. Dupin is powered by a processing engine that exploits the unique properties of the peeling process, with theoretical guarantees on detection quality and efficiency. Dupin provides user-friendly APIs for flexible customization of DSD objectives and adapts to diverse fraud detection scenarios. Empirical evaluations demonstrate that Dupin consistently outperforms several existing DSD methods, achieving performance improvements of up to 100 times compared to traditional approaches. On billion-scale graphs, our experimental results indicate that Dupin has the potential to raise the prevention rate of fraudulent transactions from 45% to 94.5% and to reduce density error from 30% to below 5%. These findings highlight the effectiveness of Dupin in real-world applications, ensuring both speed and accuracy in fraud detection.
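To make the peeling process concrete, the following is a minimal sketch of classic batch peeling for the average-degree density |E(S)|/|S|. It is not Dupin's engine or API: the function name `batch_peel`, its parameters, and the toy graph are illustrative assumptions, and the pruning techniques mentioned above are not modeled. Each round removes every vertex whose degree is at most 2(1+ε) times the current density, which is the standard way to obtain a 2(1+ε)-approximation in a logarithmic number of rounds for this metric. The per-round selection of vertices is the step a parallel framework would distribute across threads.

```python
def batch_peel(edges, eps=0.1):
    """Batch peeling for density |E(S)| / |S| (illustrative sketch only).

    Returns (best_density, best_vertex_set, rounds).
    """
    # Build an undirected adjacency structure, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    m = sum(deg.values()) // 2  # number of edges among alive vertices

    best_density, best_set, rounds = 0.0, set(), 0
    while alive:
        density = m / len(alive)
        if density > best_density:
            best_density, best_set = density, set(alive)

        # Select the whole batch first; this scan is embarrassingly parallel.
        # The batch is never empty: the average degree is 2 * density,
        # so at least one vertex falls at or below the threshold.
        threshold = 2 * (1 + eps) * density
        batch = [v for v in alive if deg[v] <= threshold]

        # Remove the batch and update degrees and the edge count.
        for v in batch:
            alive.discard(v)
            for w in adj[v]:
                if w in alive:  # each removed edge is counted exactly once
                    deg[w] -= 1
                    m -= 1
        rounds += 1

    return best_density, best_set, rounds


# Toy graph (assumed for illustration): a 5-clique with a path attached.
clique = [(i, j) for i in range(5) for j in range(i + 1, 5)]
tail = [(4, 5), (5, 6), (6, 7)]
density, subgraph, rounds = batch_peel(clique + tail, eps=0.1)
```

On this toy graph the first round peels the path and the second round sees only the clique, so the clique is recorded as the densest intermediate subgraph. Larger values of `eps` peel more vertices per round, trading approximation quality for fewer rounds.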
