Detecting Multilevel Manipulation from Limit Order Book via Cascaded Contrastive Representation Learning

Yushi Lin, Peng Yang

TL;DR

This paper tackles the detection of multilevel spoofing in high-frequency markets by exploiting the hierarchical structure of limit order book (LOB) data. It introduces a cascaded LOB representation framework built on a Transformer-based Multilevel LOB Encoder and a supervised contrastive learning objective, paired with manual features, to produce discriminative latent representations for anomaly detection. Across multiple representation models and detectors, the approach yields consistent performance gains, with Transformer-based architectures achieving state-of-the-art results. Ablation studies clarify the contributions of the cascaded architecture and the hybrid loss under limited anomaly supervision. The work also highlights that multilevel manipulation is inherently harder to detect than single-level patterns and demonstrates the practical value of exploiting the LOB hierarchy for robust, scalable market manipulation detection.

Abstract

Trade-based manipulation (TBM) severely undermines the fairness and stability of financial markets. Spoofing, one of the most covert and deceptive TBM strategies, exhibits complex anomaly patterns across multiple price levels, yet it is often simplified as a single-level manipulation. These patterns are usually concealed within the rich, hierarchical information of the Limit Order Book (LOB), which is challenging to leverage due to its high dimensionality and noise. To address this, we propose a representation learning framework that combines a cascaded LOB representation architecture with supervised contrastive learning. Extensive experiments demonstrate that our framework consistently improves detection performance across diverse models, with Transformer-based architectures achieving state-of-the-art results. In addition, we conduct systematic analyses and ablation studies to investigate multilevel manipulation and the contributions of key components to detection, offering broader insights into representation learning and anomaly detection for complex time series data.
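
As a rough illustration of the supervised contrastive learning idea mentioned above, the sketch below implements the generic supervised contrastive loss in plain Python: embeddings that share a label are pulled together and all others are pushed apart. It is not the paper's objective. This section does not specify the hybrid loss, so the function name `supcon_loss`, the `temperature` value, and the toy embeddings are assumptions made for illustration only.

```python
import math

def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's exact loss)."""
    # Normalise each embedding to unit length so dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives are the other samples that share the anchor's label.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-label partner contributes nothing
        # Temperature-scaled similarities to every other sample in the batch.
        logits = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n) if j != i
        }
        # Log-sum-exp over all non-anchor samples, stabilised by subtracting the max.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Mean negative log-probability assigned to the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy usage: hypothetical 2-D embeddings, label 0 = normal window, 1 = manipulated window.
toy_embeddings = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
print(supcon_loss(toy_embeddings, [0, 0, 1, 1]))  # labels agree with the geometry
print(supcon_loss(toy_embeddings, [0, 1, 0, 1]))  # labels disagree with the geometry
```

In this toy setup the first call should return a much smaller loss than the second, since the first labelling matches the clusters in embedding space. A representation trained to minimise such a loss is pushed toward that kind of class separation.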

Paper Structure

This paper contains 34 sections, 4 equations, 5 figures, 9 tables.

Figures (5)

  • Figure 1: The overall architecture of the decoupled framework for multilevel manipulation detection, consisting of two core stages: representation and detection (Eqs. \ref{eq:representation} and \ref{eq:detection}).
  • Figure 2: Impact of anomaly insertion depth on AUC-PR.
  • Figure 3: AUC-PR performance comparison with different loss functions and input on multilevel manipulation detection (OC-SVM): evaluated on all detected anomalies across five levels (left) and detected anomalies limited to levels 2–5 (right).
  • Figure 4: Ablation study on the cascaded architecture using JFDS with OC-SVM.
  • Figure 5: The overall pipeline of data preparation for the multilevel manipulation detection task.