Crowdsourcing Fraud Detection over Heterogeneous Temporal MMMA Graph
Zequan Xu, Qihang Sun, Shaofeng Hu, Jieming Shi, Hui Li
TL;DR
Crowdsourcing fraud on Multi-purpose Messaging Mobile Apps (MMMAs) is hard to detect because the data are heterogeneous, evolve over time, and come with few labels. The authors propose CMT, a contrastive multi-view learning framework on heterogeneous temporal graphs. CMT combines an HG-Encoder that handles heterogeneity, two history views (Temporal Snapshot and User Relation sequences), data augmentation, and a Transformer-based Contrastive Sequence Encoder to learn robust representations in a self-supervised manner. Pretraining with contrastive and binary objectives, followed by a downstream detector, yields state-of-the-art results on industry-scale WeChat data and transferable gains on FinGraph, while revealing actionable fraud patterns. By jointly modeling multi-relational structure and temporal evolution under limited labels, the approach advances graph anomaly detection, with practical implications for large-scale fraud monitoring in MMMAs and beyond.
Abstract
The rise of the click farm business using Multi-purpose Messaging Mobile Apps (MMMAs) tempts cybercriminals to perpetrate crowdsourcing frauds that cause financial losses to click farm workers. In this paper, we propose a novel contrastive multi-view learning method named CMT for crowdsourcing fraud detection over the heterogeneous temporal graph (HTG) of an MMMA. CMT captures both the heterogeneity and the dynamics of the HTG and generates high-quality representations for crowdsourcing fraud detection in a self-supervised manner. We deploy CMT to detect crowdsourcing frauds on an industry-size HTG of WeChat, a representative MMMA, where it significantly outperforms other methods. CMT also shows promising results for fraud detection on a large-scale public financial HTG, indicating that it can be applied to other graph anomaly detection tasks. We provide our implementation at https://github.com/KDEGroup/CMT.
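To make the idea of contrastive multi-view learning concrete, the following is a minimal sketch of an InfoNCE-style loss between two views of the same users. It is an illustration under stated assumptions, not the CMT implementation: the abstract does not specify the loss, InfoNCE is simply a common choice for contrastive objectives, and the names `info_nce`, `view_a`, `view_b`, and `temperature` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length (zero vectors are not handled)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss over a batch of users.

    Assumption: view_a[i] and view_b[i] are embeddings of the same user
    under two different views (the positive pair); every other row of
    view_b serves as a negative for view_a[i].
    """
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    losses = []
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every embedding in the other view.
        logits = [dot(anchor, other) / temperature for other in b]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the positive pair.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    users = [[1.0, 0.0], [0.0, 1.0]]
    aligned = info_nce(users, users)          # views agree per user
    shuffled = info_nce(users, users[::-1])   # views paired with the wrong user
    print(aligned < shuffled)
```

The loss is small when the two views of each user agree and larger when a user's view is closer to another user's, which is the signal a self-supervised encoder would be trained on in a setup of this kind.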
