FRAUDGUESS: Spotting and Explaining New Types of Fraud in Million-Scale Financial Data
Robson L. F. Cordeiro, Meng-Chieh Lee, Christos Faloutsos
TL;DR
FraudGuess addresses the challenge of spotting novel fraud types in million-scale financial data while giving analysts interpretable evidence. It splits the problem into Detection (FraudGuess-D), which identifies micro-clusters of lockstep behavior using a curated feature set and heatmaps, and Justification (FraudGuess-J), which provides an interactive dashboard and visual explanations. On real data from an Anonymous Financial Institution (AFI), FraudGuess discovers three new behaviors, two of which domain experts deem fraudulent or suspicious, and its running time scales linearly with input size, which makes it suitable for large deployments. The approach emphasizes explainability over black-box models and outlines plans for deployment and reproducibility through open-source code and synthetic data.
Abstract
Given a set of financial transactions (who buys from whom, when, and for how much), as well as prior information from buyers and sellers, how can we find fraudulent transactions? If we have labels for some transactions for known types of fraud, we can build a classifier. However, we also want to find new types of fraud that are still unknown to the domain experts ('Detection'). Moreover, we want to provide evidence to experts that supports our opinion ('Justification'). In this paper, we propose FRAUDGUESS to achieve two goals: (a) for 'Detection', it spots new types of fraud as micro-clusters in a carefully designed feature space; (b) for 'Justification', it uses visualization and heatmaps for evidence, as well as an interactive dashboard for deep dives. FRAUDGUESS is used in practice and is currently being considered for deployment at an Anonymous Financial Institution (AFI). Thus, we also present the three new behaviors that FRAUDGUESS discovered in a real, million-scale financial dataset. Two of these behaviors are deemed fraudulent or suspicious by domain experts, catching hundreds of fraudulent transactions that would otherwise go unnoticed.
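
To make the idea of "micro-clusters in a feature space, viewed as a heatmap" concrete, the following is a minimal sketch and not the authors' implementation. It assumes two hypothetical positive-valued features per entity, bins them on log-scaled axes into a 2-D count grid (the heatmap), and flags cells that are dense while their eight neighboring cells are nearly empty. The function names, the bin count, and the thresholds `min_count` and `max_neighbor_ratio` are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def heatmap_counts(x, y, bins=20):
    """Count points per cell of a 2-D grid over log10-scaled features."""
    lx, ly = np.log10(x), np.log10(y)
    counts, x_edges, y_edges = np.histogram2d(lx, ly, bins=bins)
    return counts, x_edges, y_edges


def isolated_dense_cells(counts, min_count=30, max_neighbor_ratio=0.1):
    """Return (row, col) cells that are dense while their 8 neighbors are sparse.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    padded = np.pad(counts, 1, mode="constant")  # zero border simplifies edge cells
    flagged = []
    for i, j in zip(*np.nonzero(counts >= min_count)):
        window = padded[i:i + 3, j:j + 3]        # 3x3 block centered on cell (i, j)
        neighbor_total = window.sum() - counts[i, j]
        if neighbor_total <= max_neighbor_ratio * counts[i, j]:
            flagged.append((int(i), int(j)))
    return flagged


# Synthetic data only: a broad background population plus a small group
# of entities that behave almost identically (a planted micro-cluster).
rng = np.random.default_rng(0)
background_x = 10 ** rng.normal(2.0, 0.5, 5000)
background_y = 10 ** rng.normal(1.0, 0.5, 5000)
planted_x = 10 ** rng.normal(5.0, 0.01, 200)
planted_y = 10 ** rng.normal(4.0, 0.01, 200)

x = np.concatenate([background_x, planted_x])
y = np.concatenate([background_y, planted_y])

counts, x_edges, y_edges = heatmap_counts(x, y)
flagged = isolated_dense_cells(counts)
for i, j in flagged:
    print(f"cell ({i}, {j}) holds {int(counts[i, j])} points")
```

Both steps make a single pass over the data or over the grid cells, so the cost of this sketch grows linearly with the number of points for a fixed grid size. Only the flagged cells would then need to be shown to an analyst for closer inspection.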
