Starlit: Privacy-Preserving Federated Learning to Enhance Financial Fraud Detection
Aydin Abadi, Bradley Doyle, Francesco Gini, Kieron Guinamard, Sasi Kumar Murakonda, Jack Liddell, Paul Mellor, Steven J. Murdoch, Mohammad Naseri, Hector Page, George Theodorakopoulos, Suzanne Weller
TL;DR
Starlit presents a scalable privacy-preserving federated learning framework for cross-institutional financial fraud detection, combining Private Set Intersection, Local Differential Privacy, and SecureBoost within the Flower platform to operate on data partitioned both horizontally and vertically. It provides a formal security definition (Celestial) and a simulation-based proof, along with a practical end-to-end implementation that tolerates client dropouts and does not require prior account freezing. The system introduces two novel capabilities: securely identifying discrepancies in shared features across clients and aggregating per-user flags with privacy protections, enabling richer features for anomaly detection. Empirical results on synthetic, large-scale financial datasets demonstrate Starlit's scalability, efficiency, and competitive accuracy, with broader applicability to terrorism mitigation, digital health, and benefit-fraud detection. The work offers a blueprint for privacy-preserving, multi-party collaboration in regulated domains, showing that rigorous security can coexist with practical performance.
Abstract
Federated Learning (FL) is a data-minimization approach that enables collaborative model training across diverse clients holding local data, without direct data exchange. However, state-of-the-art FL solutions for identifying fraudulent financial transactions suffer from one or more of the following limitations. They (1) lack a formal security definition and proof, (2) assume prior freezing of suspicious customers' accounts by financial institutions (limiting the solutions' adoption), (3) scale poorly, involving either $O(n^2)$ computationally expensive modular exponentiations (where $n$ is the total number of financial institutions) or highly inefficient fully homomorphic encryption, (4) assume the parties have already completed the identity alignment phase, hence excluding it from the implementation, performance evaluation, and security analysis, and (5) struggle to tolerate client dropouts. This work introduces Starlit, a novel scalable privacy-preserving FL mechanism that overcomes these limitations. It has various applications, such as enhancing financial fraud detection, mitigating terrorism, and improving digital health. We implemented Starlit and conducted a thorough performance analysis using synthetic data from a key player in global financial transactions. The evaluation indicates Starlit's scalability, efficiency, and accuracy.
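
As a rough illustration of the Local Differential Privacy ingredient named in the TL;DR, the Python sketch below applies classic randomized response to binary per-user flags and then debiases the aggregate. It is a textbook mechanism shown for intuition only and is not taken from Starlit. The function names, the privacy parameter `epsilon`, and the simulated 10% flag rate are all assumptions made for this example.

```python
import math
import random


def randomize_flag(flag: int, epsilon: float, rng: random.Random) -> int:
    """Randomized response (illustrative, not Starlit's mechanism).

    Reports the true bit with probability e^eps / (1 + e^eps),
    otherwise reports the flipped bit.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return flag if rng.random() < p_truth else 1 - flag


def estimate_rate(reports: list, epsilon: float) -> float:
    """Unbiased estimate of the true fraction of 1-flags.

    If pi is the true rate and p the truth probability, then
    E[observed] = pi * p + (1 - pi) * (1 - p), which is solved for pi.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical data: 100,000 users, 10% of whom carry a flag.
rng = random.Random(0)
true_flags = [1 if i % 10 == 0 else 0 for i in range(100_000)]
reports = [randomize_flag(f, 1.0, rng) for f in true_flags]
print(round(estimate_rate(reports, 1.0), 3))
```

Each individual report is plausibly deniable, yet the aggregate estimate stays close to the true rate when many reports are combined. A smaller `epsilon` gives stronger privacy at the cost of a noisier estimate.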
