
Privacy-Preserving Federated Fraud Detection in Payment Transactions with NVIDIA FLARE

Holger R. Roth, Sarthak Tickoo, Mayank Kumar, Isaac Yang, Andrew Liu, Amit Varshney, Sayani Kundu, Iustina Vintila, Peter Madsgaard, Juraj Milcak, Chester Chen, Yan Cheng, Andrew Feng, Jeff Savio, Vikram Singh, Craig Stancill, Gloria Wan, Evan Powell, Anwar Ul Haq, Sudhir Upadhyay, Jisoo Lee

Abstract

Fraud-related financial losses continue to rise, while regulatory, privacy, and data-sovereignty constraints increasingly limit the feasibility of centralized fraud detection systems. Federated Learning (FL) has emerged as a promising paradigm for enabling collaborative model training across institutions without sharing raw transaction data. Yet its practical effectiveness under realistic, non-IID financial data distributions remains insufficiently validated. In this work, we present a multi-institution, industry-oriented proof-of-concept study evaluating federated anomaly detection for payment transactions using the NVIDIA FLARE framework. We simulate a realistic federation of heterogeneous financial institutions, each observing distinct fraud typologies and operating under strict data isolation. Using a deep neural network trained via federated averaging (FedAvg), we demonstrate that federated models achieve a mean F1-score of 0.903, substantially outperforming locally trained models (0.643) and closely approaching centralized training performance (0.925), while preserving full data sovereignty. We further analyze convergence behavior, showing that strong performance is achieved within 10 federated communication rounds, which highlights the operational viability of FL in latency- and cost-sensitive financial environments. To support deployment in regulated settings, we evaluate model interpretability using Shapley-based feature attribution and confirm that federated models rely on semantically coherent, domain-relevant decision signals. Finally, we incorporate sample-level differential privacy via DP-SGD and demonstrate favorable privacy-utility trade-offs...
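
The abstract names federated averaging (FedAvg) as the aggregation method. As a rough illustration of the idea only, the following minimal sketch averages flat parameter vectors from several clients, weighting each by its local sample count. It does not use the NVIDIA FLARE API and is not the authors' implementation; the function name `fedavg`, the flat-list parameter representation, and the example numbers are assumptions made for illustration.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           num_samples: Sequence[int]) -> List[float]:
    """Sample-weighted average of client parameter vectors (hypothetical helper).

    client_weights: one flat parameter vector per client.
    num_samples:    number of local training samples per client.
    """
    if not client_weights or len(client_weights) != len(num_samples):
        raise ValueError("need one sample count per client")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must send vectors of equal length")

    aggregated = [0.0] * dim
    for weights, n in zip(client_weights, num_samples):
        share = n / total  # this client's share of the total data
        for i, value in enumerate(weights):
            aggregated[i] += share * value
    return aggregated


# Illustrative values only: two clients, the second holds 3x more data.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

In a real federation the server would repeat this step once per communication round, sending the aggregated parameters back to each client for further local training; only model parameters, never raw transactions, leave a site.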

Paper Structure (43 sections, 7 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Federated learning for collaborative fraud detection.
  • Figure 2: Data distributions across sites. A, Site-level fraud rates and aggregate amount statistics. B, Amount distributions for normal vs. fraudulent transactions. C–D, Anomaly-type distributions for training and evaluation partitions.
  • Figure 3: Central vs. Local vs. FL: convergence and precision-recall performance. A, Mean binary F1-score across clients and anomaly types over federated rounds (shaded bands: s.d. across clients). B, Precision-recall curves evaluated across client test sets for each checkpointed model; solid lines show the mean and bands indicate ±1 s.d. across clients. The no-skill baseline is included, and AUPRC values are reported in the legend.
  • Figure 4: Confusion matrices of the final FedAvg global model for each participating bank.
  • Figure 5: Feature importance plot using Shapley-value-based attribution (Captum GradientShap).
  • ...and 1 more figures
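
The figures above report a mean binary F1-score across clients and per-bank confusion matrices. As a hedged illustration of how those two quantities relate, the sketch below computes F1 from confusion-matrix counts for each client and then averages across clients. The helper names and the counts are hypothetical and are not taken from the paper.

```python
from statistics import mean
from typing import Iterable, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Binary F1 for the fraud class: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    # With no positives predicted or present, F1 is undefined; return 0.0 by convention.
    return 2 * tp / denom if denom else 0.0


def mean_f1(per_client_counts: Iterable[Tuple[int, int, int]]) -> float:
    """Unweighted mean of per-client F1 scores (each tuple is TP, FP, FN)."""
    return mean(f1_from_counts(tp, fp, fn) for tp, fp, fn in per_client_counts)


# Hypothetical confusion-matrix counts for two clients (not real results).
clients = [(8, 2, 2), (5, 5, 0)]
print(mean_f1(clients))
```

True negatives do not enter the formula, which is one reason F1 is commonly preferred over accuracy when fraud is a small fraction of all transactions.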