Needles in a haystack: using forensic network science to uncover insider trading

Gian Jaeger, Wang Ngai Yeung, Renaud Lambiotte

TL;DR

Detecting insider trading is challenging because labelled data are scarce. The authors propose a forensic network approach that constructs a time-aligned, edge-weighted network from 2.9 million SEC Form 4 trades and analyzes centrality and anomalous egonets to identify coordinated insider behavior. They validate the network against two null models, a calibrated generative model and a constrained temporal shuffle, showing that observed coordination exceeds chance and identifying clusters, including multi-family groups. This scalable, unsupervised screening framework gives investigators a practical tool for prioritizing follow-ups in unlabelled data and complements traditional event-based anomaly detection.

Abstract

Although the automation and digitisation of anti-financial crime investigation have made significant progress in recent years, detecting insider trading remains a unique challenge, partly due to the limited availability of labelled data. To address this challenge, we propose a data-driven network approach that flags groups of corporate insiders who report coordinated transactions that are indicative of insider trading. Specifically, we leverage data on 2.9 million trades reported to the U.S. Securities and Exchange Commission (SEC) by company insiders (C-suite executives, board members and major shareholders) between 2014 and 2024. Our proposed algorithm constructs weighted edges between insiders based on the temporal similarity of their trades over the 10-year timeframe. Within this network we then uncover patterns that indicate insider trading by focusing on central nodes and anomalous subgraphs. To assess the validity of our approach, we evaluate our findings against two null models, generated by running our algorithm on synthetic, empirically calibrated datasets and on shuffled datasets. The results indicate that our approach can be used to detect pairs or clusters of insiders whose behaviour suggests insider trading and/or market manipulation.
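To make the edge-construction step concrete, the following is a minimal sketch of how weighted edges could be built from the temporal overlap of insiders' trades. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not specify the similarity measure, so the Jaccard index over sets of trade dates is assumed here, and the names `jaccard`, `build_edges`, `min_similarity` and the toy records are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_edges(trades, min_similarity=0.0):
    """Build weighted edges between insiders from (insider_id, trade_date) pairs.

    ASSUMPTION: temporal similarity is the Jaccard index of the two insiders'
    trade-date sets. The paper's actual measure may differ.
    Returns {(u, v): weight} for pairs whose weight exceeds min_similarity.
    """
    dates_by_insider = {}
    for insider, date in trades:
        dates_by_insider.setdefault(insider, set()).add(date)

    edges = {}
    # Compare every pair of insiders once, in a stable (sorted) order.
    for u, v in combinations(sorted(dates_by_insider), 2):
        weight = jaccard(dates_by_insider[u], dates_by_insider[v])
        if weight > min_similarity:
            edges[(u, v)] = weight
    return edges


# Hypothetical toy data: A and B always trade on the same days, C rarely does.
toy_trades = [
    ("A", "2020-01-02"), ("A", "2020-03-15"), ("A", "2020-06-01"),
    ("B", "2020-01-02"), ("B", "2020-03-15"), ("B", "2020-06-01"),
    ("C", "2020-03-15"), ("C", "2021-09-09"),
]
edges = build_edges(toy_trades)
for pair, weight in sorted(edges.items()):
    print(pair, round(weight, 2))
```

In this toy example the pair A and B receives the maximum weight because their trade dates coincide exactly, while pairs involving C receive a low weight. An all-pairs comparison is quadratic in the number of insiders, so a dataset of the size described above would likely need some form of candidate restriction (for example, comparing only insiders who share a company); that choice is likewise an assumption rather than something stated in the abstract.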

Paper Structure

This paper contains 35 sections, 29 equations, 7 figures, 7 tables, 1 algorithm.

Figures (7)

  • Figure 1: Frequency distribution of connected components by size in the network.
  • Figure 2: Distribution of key network metrics from $10^3$ simulations of two null models compared to the empirical network. Panels show (a) number of nodes, (b) number of edges, (c) number of connected components, and (d) ultra-strong ties (edges with similarity $> 0.9$). The $//$ on the x-axis denotes a discontinuity in the scale, used to visualise the tightly concentrated null distributions alongside the distant empirical values.
  • Figure 3: Egonets of the two most central insiders measured via closeness centrality. Node colours represent company affiliation, and edge colours capture edge weights, with yellow indicating strong ties, green medium-to-strong ties, blue moderate ties, and purple weak ties.
  • Figure 4: Egonets of the two most central insiders measured via eigenvector centrality. Node colours represent company affiliation, and edge colours capture edge weights, with yellow indicating strong ties, green medium-to-strong ties, blue moderate ties, and purple weak ties. The red node denotes the individual under consideration.
  • Figure 5: Egonets of the four most anomalous individuals identified using the OddBall algorithm. Node colours represent company affiliation, and edge colours capture edge weights, with yellow indicating strong ties, green medium-to-strong ties, blue moderate ties, and purple weak ties. The red node denotes the individual under consideration.
  • ...and 2 more figures
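
The four quantities compared in Figure 2 (nodes, edges, connected components, and ultra-strong ties with similarity $> 0.9$) can be computed from a weighted edge list as sketched below. This is an illustrative sketch only: the function name `network_metrics`, the edge-dictionary format, and the toy weights are hypothetical, and counting only nodes that have at least one edge is an assumption about how the network is defined.

```python
def network_metrics(edges, strong_threshold=0.9):
    """Summarise a weighted network given as {(u, v): weight}.

    ASSUMPTION: a node is any insider that appears in at least one edge.
    Connected components are found with a small union-find structure.
    """
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk to the root,
        # shortening the path along the way (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # merge the two components

    components = {find(node) for node in list(parent)}
    return {
        "nodes": len(parent),
        "edges": len(edges),
        "components": len(components),
        "ultra_strong_ties": sum(
            1 for weight in edges.values() if weight > strong_threshold
        ),
    }


# Hypothetical toy network: one chain A-B-C and one separate pair D-E.
toy_edges = {("A", "B"): 0.95, ("B", "C"): 0.40, ("D", "E"): 0.92}
metrics = network_metrics(toy_edges)
print(metrics)
```

Running the same function on each simulated null network and on the empirical network would give the kind of side-by-side distributions the Figure 2 caption describes; how the null networks themselves are generated is not covered by this sketch.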