Generative Dynamic Graph Representation Learning for Conspiracy Spoofing Detection

Sheng Xiang, Yidong Jiang, Yunting Chen, Dawei Cheng, Guoping Zhao, Changjun Jiang

TL;DR

This paper tackles conspiracy spoofing detection in financial markets, where trading behavior is temporally irregular and transaction relationships are heterogeneous. It introduces the Generative Dynamic Graph Model (GDGM), which encodes time-stamped trading data with Neural ODEs and GRUs, generates pseudo-labels via Beta wavelet graph learning, and aggregates information within and across relation types through a heterogeneous graph attention mechanism. The approach outperforms state-of-the-art models on real-world spoofing datasets and has been deployed in a large-scale online trading system, which supports its practical applicability. Together, continuous-time dynamics, pseudo-labeling, and multi-relational graph fusion form a framework for detecting sophisticated spoofing patterns in dynamic markets.

Abstract

Spoofing detection in financial trading is crucial, especially for identifying complex behaviors such as conspiracy spoofing. Traditional machine-learning approaches primarily focus on isolated node features, often overlooking the broader context of interconnected nodes. Graph-based techniques, particularly Graph Neural Networks (GNNs), have advanced the field by leveraging relational information effectively. However, in real-world spoofing detection datasets, trading behaviors exhibit dynamic, irregular patterns. Existing spoofing detection methods, though effective in some scenarios, struggle to capture the complexity of dynamic, diverse, and evolving inter-node relationships. To address these challenges, we propose a novel framework called the Generative Dynamic Graph Model (GDGM), which models dynamic trading behaviors and the relationships among nodes to learn representations for conspiracy spoofing detection. Specifically, our approach incorporates a generative dynamic latent space to capture temporal patterns and evolving market conditions. Raw trading data is first converted into time-stamped sequences. We then model trading behaviors using neural ordinary differential equations and gated recurrent units to generate representations that incorporate the temporal dynamics of spoofing patterns. Furthermore, pseudo-label generation and heterogeneous aggregation techniques are employed to gather relevant information and enhance detection performance for conspiratorial spoofing behaviors. Experiments conducted on spoofing detection datasets demonstrate that our approach outperforms state-of-the-art models in detection accuracy. Additionally, our spoofing detection system has been successfully deployed in one of the largest global trading markets, further validating the practical applicability and performance of the proposed method.
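
The encoding step described in the abstract (time-stamped sequences fed to neural ODEs and gated recurrent units) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `OdeGruEncoder`, the dynamics function `tanh(W h)`, the fixed-step Euler integrator (swapped in for a proper ODE solver), the omitted bias terms, and the random untrained weights are all assumptions made for illustration. The idea shown is only the general pattern: the hidden state evolves continuously across the irregular gap between two observations, then a GRU-style update absorbs each new observation.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng, scale=0.3):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class OdeGruEncoder:
    """Hypothetical ODE + GRU encoder for irregularly sampled sequences."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        cat = input_dim + hidden_dim  # gates act on the concatenation [x, h]
        self.W_ode = rand_matrix(hidden_dim, hidden_dim, rng)
        self.W_z = rand_matrix(hidden_dim, cat, rng)  # update gate
        self.W_r = rand_matrix(hidden_dim, cat, rng)  # reset gate
        self.W_c = rand_matrix(hidden_dim, cat, rng)  # candidate state

    def evolve(self, h, dt, step=0.1):
        """Euler-integrate the assumed dynamics dh/dt = tanh(W_ode h) over dt."""
        n = max(1, math.ceil(dt / step))
        eps = dt / n
        for _ in range(n):
            dh = [math.tanh(a) for a in matvec(self.W_ode, h)]
            h = [hi + eps * di for hi, di in zip(h, dh)]
        return h

    def gru_update(self, h, x):
        """GRU-style update (biases omitted for brevity)."""
        xh = x + h
        z = [sigmoid(a) for a in matvec(self.W_z, xh)]
        r = [sigmoid(a) for a in matvec(self.W_r, xh)]
        xrh = x + [ri * hi for ri, hi in zip(r, h)]
        c = [math.tanh(a) for a in matvec(self.W_c, xrh)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]

    def encode(self, times, features):
        """Return the final hidden state for one time-stamped sequence."""
        h = [0.0] * self.hidden_dim
        prev_t = times[0]
        for t, x in zip(times, features):
            if t > prev_t:
                h = self.evolve(h, t - prev_t)  # continuous-time gap
            h = self.gru_update(h, x)           # discrete observation
            prev_t = t
        return h


# Illustrative, made-up order features observed at irregular timestamps.
times = [0.0, 0.4, 0.5, 2.3]
features = [[1.0, 0.2], [0.8, -0.1], [1.2, 0.0], [0.1, 0.9]]
encoder = OdeGruEncoder(input_dim=2, hidden_dim=4)
embedding = encoder.encode(times, features)
```

Because the state is integrated over each gap, the same feature sequence observed at different timestamps yields a different embedding, which is the property that makes this family of encoders a natural fit for irregular trading sequences.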

Paper Structure

This paper contains 24 sections, 20 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: A typical example of spoofing transactions. Traders place deceptive sell (ask) or buy (bid) orders without execution to influence the market price by misleading other traders about the market demand or supply.
  • Figure 2: The proposed Generative Dynamic Graph Model (GDGM) architecture for Conspiracy Spoofing Detection. The first part is the historical transaction encoding of input time series data, which builds the embedding of irregular transaction series. The second part is the temporal graph attention. The third part is the heterogeneous graph attention layer, which aggregates the information for different types of neighbors. The fourth part is the classification layer, a multi-layer perceptron that predicts whether a transaction is spoofing.
  • Figure 3: The experimental results of spoofing detection methods under queue-based study. During the experiment, pre-trained models were employed to detect spoofing transactions over four weeks. At the end of each four-week interval, newly observed cases were merged with the historically labeled database, and the models were retrained to improve their performance.
  • Figure 4: AUC of our method, in terms of threshold $z$, dimension of encoding output $h$, dimension of the attention vector $q$, and number of heterogeneous aggregation layers.
  • Figure 5: Performance comparison for models with different generative data encoding modules. We fix the trained model and make predictions for the next 12 weeks.