
Detecting Complex Money Laundering Patterns with Incremental and Distributed Graph Modeling

Haseeb Tariq, Alen Kaja, Marwan Hassani

Abstract

Money launderers exploit the limitations of existing detection approaches by concealing their financial footprints. They do so by replicating transaction patterns that monitoring systems cannot easily distinguish. As a result, criminally gained assets are pushed into legitimate financial channels without drawing attention. Algorithms developed to monitor money flows often struggle with scale and complexity. Detection is made harder still by the persistent inability of current solutions to control the excessive number of false-positive signals produced by rigid, risk-based rule systems. We propose a framework called ReDiRect (REduce, DIstribute, and RECTify), specifically designed to overcome these challenges. The primary contribution of our work is a novel framing of this problem in an unsupervised setting, where a large transaction graph is fuzzily partitioned into smaller, manageable components to enable fast, distributed processing. In addition, we define a refined evaluation metric that better captures the effectiveness of exposed money laundering patterns. Through comprehensive experimentation, we demonstrate that our framework outperforms existing and state-of-the-art techniques, particularly in terms of efficiency and real-world applicability. For validation, we used the real (open-source) Libra dataset and the recently released synthetic datasets by IBM Watson. Our code and datasets are available at https://github.com/mhaseebtariq/redirect.

Paper Structure

This paper contains 21 sections, 5 equations, 5 figures, 7 tables, 2 algorithms.

Figures (5)

  • Figure 1: Conceptualization of the problem formulation for ReDiRect. The left side represents the actual data space. In the next part, nodes are filtered out, either by the primary detection system or by the output of an earlier ReDiRect run. Finally, the fuzzy communities (per node) are labeled as suspicious by an unsupervised machine learning model.
  • Figure 2: Example alerted flow. The yellow nodes are extra (false positives) and the red nodes are missing (false negatives). The contextual completeness is 7 (correctly alerted context) / 11 (alerted with missing context) ≈ 0.64.
  • Figure 3: TPR AUC plots comparing the initial run of ReDiRect (no reduction in data) with the outputs of the RM-2 (10% reduction) and RM-3 (12.5% reduction) methods.
  • Figure 4: Execution times for each of the heavy-duty tasks in the framework, with an increasing number of nodes in the $\mathcal{D}_{lib}^{real}$ dataset.
  • Figure 5: Execution times for the large $\mathcal{D}_{ibm}^{syn}$ dataset, with an increasing level of distribution.
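
The arithmetic in the Figure 2 caption (7 / 11 ≈ 0.64) can be reproduced with a short sketch. This is only one possible reading of the caption, not the paper's formal definition: it assumes that "alerted with missing context" means the union of the alerted nodes and the nodes that truly belong to the flow, which makes the score the ratio of the intersection to that union. The function name `contextual_completeness`, the node identifiers, and the 2/2 split between extra and missing nodes are all hypothetical; the caption only fixes the totals 7 and 11.

```python
def contextual_completeness(alerted, actual):
    """Share of correctly alerted nodes among all alerted-or-missing nodes.

    ASSUMPTION: the denominator is the union of the alerted set and the
    actual (ground-truth) set. This is inferred from the Figure 2 caption
    and may differ from the definition used in the paper.
    """
    alerted, actual = set(alerted), set(actual)
    union = alerted | actual
    if not union:
        raise ValueError("both node sets are empty")
    return len(alerted & actual) / len(union)


# Hypothetical node ids: 7 nodes are both alerted and part of the true flow.
correct = {f"n{i}" for i in range(7)}
extra = {"x1", "x2"}      # alerted but not in the flow (yellow); split assumed
missing = {"m1", "m2"}    # in the flow but not alerted (red); split assumed

alerted_nodes = correct | extra
actual_nodes = correct | missing

score = contextual_completeness(alerted_nodes, actual_nodes)
print(round(score, 2))  # 7 / 11, rounded to two decimals: 0.64
```

Under this reading, the score equals 1.0 only when the alerted set matches the true flow exactly, so both extra and missing nodes lower it.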