Garbage in Garbage out: Impacts of data quality on criminal network intervention

Wang Ngai Yeung, Riccardo Di Clemente, Renaud Lambiotte

TL;DR

The paper investigates how data quality shapes the effectiveness of interventions in criminal networks, showing that missing data and self-organization amplify robustness, especially in decentralized structures. It combines percolation-based disruption with classical centrality, heuristic, and learning-based attacks across multiple real and synthetic networks, revealing that data incompleteness can severely undermine intervention efficacy while inaccuracies have more nuanced effects. A key finding is that the Remove-One-Attach-Many (ROAM) heuristic can markedly boost robustness in centralized networks by decentralizing their structure, underscoring the risk of leader-hiding tactics in evolving illicit networks. The authors advocate for interoperable intelligence ecosystems and advanced network-inference techniques to mitigate data-quality challenges and improve disruption outcomes in practical law enforcement contexts.

Abstract

Criminal networks such as human trafficking rings are threats to the rule of law, democracy and public safety in our global society. Network science provides invaluable tools to identify key players and design interventions for Law Enforcement Agencies (LEAs), e.g., to dismantle their organisation. However, poor data quality and the adaptiveness of criminal networks through self-organization make effective disruption extremely challenging. Although there exists a large body of work building and applying network scientific tools to attack criminal networks, these works often implicitly assume that the network measurements are accurate and complete. Moreover, there is thus far no comprehensive understanding of the impacts of data quality on the downstream effectiveness of interventions. This work investigates the relationship between data quality and intervention effectiveness based on classical graph theoretic and machine learning-based approaches. Decentralization emerges as a major factor in network robustness, particularly under conditions of incomplete data, which render attack strategies largely ineffective. Moreover, the robustness of centralized networks can be boosted using simple heuristics, making targeted attacks less feasible. Consequently, we advocate for a more cautious application of network science in disrupting criminal networks, the continuous development of an interoperable intelligence ecosystem, and the creation of novel network inference techniques to address data quality challenges.
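
The percolation setting described in the abstract can be illustrated with a small sketch. It ranks nodes by degree on an observed network in which some edges are missing, removes nodes from the true network in that order, and records the area under the largest connected component (LCC) trajectory. Everything here is an assumption made for illustration: the function names, the edge-dropping model of incompleteness, the tie-breaking rule, and the toy star graph are hypothetical and are not taken from the paper.

```python
import random
from collections import deque


def build_adj(nodes, edges):
    """Undirected adjacency sets from an edge list."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def lcc_size(adj, removed):
    """Size of the largest connected component, ignoring removed nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def degree_ranking(adj):
    """Static attack order: highest degree first, ties broken by label."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))


def percolation_auc(true_adj, ranking):
    """Remove nodes in ranking order; return the LCC trajectory and its AUC."""
    n = len(true_adj)
    removed = set()
    trajectory = [lcc_size(true_adj, removed) / n]
    for node in ranking:
        removed.add(node)
        trajectory.append(lcc_size(true_adj, removed) / n)
    # Trapezoidal rule over the fraction of nodes removed (step 1/n).
    auc = sum((a + b) / 2 for a, b in zip(trajectory, trajectory[1:])) / n
    return trajectory, auc


def observe(nodes, edges, keep_prob, rng):
    """Assumed incompleteness model: each edge is seen with prob. keep_prob."""
    return build_adj(nodes, [e for e in edges if rng.random() < keep_prob])


# Toy centralized network: hub 5 connected to leaves 0..4 (hypothetical data).
nodes = list(range(6))
edges = [(5, leaf) for leaf in range(5)]
true_adj = build_adj(nodes, edges)
rng = random.Random(0)

# Complete data: the hub is ranked first and the network collapses at once.
_, auc_full = percolation_auc(true_adj, degree_ranking(observe(nodes, edges, 1.0, rng)))
# No edges observed: the ranking is uninformative and the hub is removed last.
_, auc_blind = percolation_auc(true_adj, degree_ranking(observe(nodes, edges, 0.0, rng)))
print(round(auc_full, 3), round(auc_blind, 3))
```

A lower AUC means a more effective attack. In this toy case the attack planned on complete data yields a smaller AUC than the one planned with no edge information, which mirrors, in a very simplified way, the abstract's point that incomplete data weakens targeted attacks.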

Paper Structure (14 sections, 4 equations, 5 figures, 2 tables, 1 algorithm)

Figures (5)

  • Figure 1: Criminal networks investigated in this work. A: Rank-Degree distribution fitted with the Rank-Size scaling law $P_k = P_1 k^{-q}$, where $P_1$ denotes the highest degree and $k \in \{1, 2, \ldots, N\}$ refers to the rank. Rank is normalized for comparative purposes. Note that not all networks were well-fitted due to varying levels of network centralization. B: Visualization of the networks. C: Baseline percolation. The shaded area is the standard deviation $\sigma$, and $\langle AUC \rangle$ is the average Area Under the Curve (AUC) of the largest connected component (LCC) trajectories across all node-ranking methods.
  • Figure 2: Impact of data incompleteness on percolation effectiveness in the 'Ndrangheta network, with $10^3$ simulations per node-ranking method in each data incompleteness scenario. A: Boxplots of all node-ranking methods under different data completeness scenarios. Whiskers show the interquartile range (IQR) of the AUC, and outliers are indicated by small dots. B: Quantile regression of the effect of data completeness on AUC. C: Percolation plots of four different node-ranking methods under different data scenarios, averaged over simulations.
  • Figure 3: Results of the percolation experiments on the ROAM-altered cocaine trafficking ring and the 'Ndrangheta network ($b=6, exec_n = 8$). A: Evolution of the LCC in the ROAM-altered network (blue line) against the original network (gray dotted line). The green area indicates the positive difference in AUC between the two trajectories. The trajectories are averaged over all measures used as attack strategies. B: Change in AUC with different values of $b$ and $exec_n$. C: Change in the four network statistics over varying $exec_n$ of ROAM.
  • Figure 4: Rank-biased overlap similarity of node-ranking methods used in the percolation experiment. Higher values indicate higher similarity between node-ranking methods.
  • Figure 5: One execution of the Remove-One-Attach-Many heuristic with a budget $b = 3$.
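
The Figure 5 caption names the Remove-One-Attach-Many heuristic but does not spell out its steps. The sketch below shows one execution under an assumed rule set: remove the edge between the node to be hidden and its highest-degree neighbour, then attach that neighbour to up to $b - 1$ of the hidden node's remaining neighbours, lowest degree first. The function name `roam`, the tie-breaking by label, and the toy star graph are hypothetical; the authors' exact procedure may differ.

```python
def roam(adj, target, budget):
    """One assumed ROAM execution on adjacency sets, modified in place.

    Returns (removed_edge, added_edges). Uses one edit for the removal and
    at most budget - 1 edits for the new attachments.
    """
    if not adj[target]:
        return None, []

    # Step 1 (assumed rule): cut the link to the highest-degree neighbour.
    # Sorting first makes ties resolve to the smallest label.
    v0 = max(sorted(adj[target]), key=lambda n: len(adj[n]))
    adj[target].discard(v0)
    adj[v0].discard(target)

    # Step 2 (assumed rule): attach v0 to the target's remaining neighbours
    # that it is not yet linked to, preferring low-degree nodes.
    candidates = sorted(
        (n for n in adj[target] if n not in adj[v0]),
        key=lambda n: (len(adj[n]), n),
    )
    added = []
    for n in candidates[: budget - 1]:
        adj[v0].add(n)
        adj[n].add(v0)
        added.append((v0, n))
    return (target, v0), added


# Toy star: leader 0 connected to members 1..4 (hypothetical data), b = 3.
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
removed, added = roam(adj, target=0, budget=3)
print(removed, added, {n: len(nbrs) for n, nbrs in adj.items()})
```

In this toy case the leader loses one connection while a former neighbour gains two, so the degree distribution becomes flatter and the leader stands out less to a degree-based ranking. This is the decentralizing effect the summary attributes to ROAM.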