Distributed Monitoring for Data Distribution Shifts in Edge-ML Fraud Detection

Nader Karayanni, Robert J. Shahla, Chieh-Lien Hsiao

TL;DR

The paper addresses data distribution shift in edge-ML fraud detection by proposing an open-source framework that continuously monitors drift across a network of edge devices, using a distributed Kolmogorov-Smirnov (KS) statistic computed from per-edge $t$-digests. A Python-based client-server architecture uses compact, mergeable representations of local distributions and serverless backend aggregation, minimizing bandwidth while delivering accurate estimates of $KS(F_1,F_2)=\sup_x|F_1(x)-F_2(x)|$. Experiments on real-world and synthetic financial datasets show that the distributed approach (T-Digest-KS) closely matches the fully centralized baseline (Optimal-KS), with median errors below $0.004$, and maintains low false-positive and false-negative rates while offering scalable backend performance and reduced client overhead. The work advances practical, privacy-conscious monitoring for edge-ML fraud systems and lays groundwork for future privacy-preserving and sliding-window extensions in holistic edge monitoring frameworks.
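
To make the KS definition concrete, here is a minimal Python sketch of the exact two-sample statistic on raw data, which corresponds to what a fully centralized computation would evaluate. The function name `ks_statistic` is hypothetical and is not taken from the paper's code base.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Exact two-sample KS statistic: sup_x |F_a(x) - F_b(x)| over empirical CDFs.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    best = 0.0
    # Both empirical CDFs are right-continuous step functions, so the
    # largest gap between them is attained at one of the observed points.
    for x in a + b:
        f_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        f_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        best = max(best, abs(f_a - f_b))
    return best
```

For example, `ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])` evaluates to 0.5: half of the first sample lies below every value of the second. This exact form needs all raw values in one place, which is the cost the distributed approach is designed to avoid.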

Abstract

The digital era has seen a marked increase in financial fraud. Edge ML has emerged as a promising solution for fraud detection in smartphone payment services, enabling the deployment of ML models directly on edge devices. This approach enables more personalized, real-time fraud detection. However, a significant gap in current research is the lack of a robust system for monitoring data distribution shifts in these distributed edge ML applications. Our work bridges this gap by introducing a novel open-source framework designed for continuous monitoring of data distribution shifts on a network of edge devices. Our system includes an innovative calculation of the Kolmogorov-Smirnov (KS) test over a distributed network of edge devices, enabling efficient and accurate monitoring of shifts in user behavior. We comprehensively evaluate the proposed framework on both real-world and synthetic financial transaction datasets and demonstrate its effectiveness.
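
The idea of computing a KS statistic over many devices without shipping raw data can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the paper uses t-digests as the per-device summary, whereas this sketch substitutes a fixed-bin histogram over shared bin edges as a simpler mergeable stand-in. All function names (`local_summary`, `merge`, `ks_from_summaries`) are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def local_summary(values, edges):
    """Per-device summary: counts per bin for shared, sorted `edges`.

    Bin i holds values v with edges[i-1] < v <= edges[i]; the final bin
    holds everything above the last edge. (Histogram stand-in for a t-digest.)
    """
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_left(edges, v)] += 1
    return counts


def merge(summaries):
    """Server-side aggregation: element-wise sum of per-device counts."""
    return [sum(column) for column in zip(*summaries)]


def ks_from_summaries(ref_counts, cur_counts):
    """Estimate KS as the largest CDF gap evaluated at the shared bin edges."""
    n_ref, n_cur = sum(ref_counts), sum(cur_counts)
    if n_ref == 0 or n_cur == 0:
        raise ValueError("both summaries must contain at least one value")
    cdf_ref = [c / n_ref for c in accumulate(ref_counts)]
    cdf_cur = [c / n_cur for c in accumulate(cur_counts)]
    return max(abs(p - q) for p, q in zip(cdf_ref, cdf_cur))


# Two devices hold the reference window, two hold the current window.
edges = [1, 2, 3, 4, 5, 6]
reference = merge([local_summary([1, 2], edges), local_summary([3, 4], edges)])
current = merge([local_summary([3, 4], edges), local_summary([5, 6], edges)])
estimate = ks_from_summaries(reference, current)
```

Because the CDFs are compared only at the bin edges, the estimate can never exceed the exact KS value and may fall below it when the bins are coarse. A t-digest serves the same mergeable role while adapting its resolution to the data rather than relying on pre-agreed edges.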

Paper Structure (22 sections, 1 equation, 7 figures)

Figures (7)

  • Figure 1: KS-statistic visualization [enwiki:ks].
  • Figure 2: Framework overview
  • Figure 3: Handling queue message.
  • Figure 4: Real-world dataset accuracy with different shifts.
  • Figure 5: Accuracy with a percentage of shifted users.
  • ...and 2 more figures